Update example & readme & license
wspl committed Feb 17, 2017
1 parent 971e123 commit 7908859
Showing 4 changed files with 112 additions and 1 deletion.
14 changes: 14 additions & 0 deletions License
@@ -0,0 +1,14 @@
Copyright (c) 2017 Plutonist
All rights reserved.

Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are
duplicated in all such forms and that any documentation,
advertising materials, and other materials related to such
distribution and use acknowledge that the software was developed
by the Plutonist. The name of the
Plutonist may not be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
97 changes: 97 additions & 0 deletions Readme.md
@@ -58,4 +58,101 @@ link: https://reactos.org/project-news/reactos-044-released
title: FeFETs: How this new memory stacks up against existing non-volatile memory
site: semiengineering.com
link: http://semiengineering.com/what-are-fefets/
```

## Script Spec

### Town

A town is a lambda-like expression for storing an (im)mutable string. Most of the time, it is used to store a URL.

```
page(@page=1, ext) = "https://news.ycombinator.com/news?p={@page}&ext={ext}"
```

To use a town, call it as if it were a function:

```
news[]: page(ext="Hello World!") -> $("tr.athing")
```

You might have noticed that the `@page` parameter is not passed in the call above. That is because it is a special parameter.

In a town definition, an expression such as `name="something"` declares that the parameter `name` has the default value `"something"`.
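
To make the default-value rule concrete, here is a minimal Go sketch of how a town could be rendered. It is not the project's actual implementation: the `Town` type, its fields, and the `Render` method are hypothetical names introduced for illustration, and the sketch does no URL escaping (the spec above does not say whether the real tool does).

```go
package main

import (
	"fmt"
	"strings"
)

// Town is a hypothetical model of a town definition: a string template,
// its parameter names, and any declared default values.
type Town struct {
	Template string
	Params   []string
	Defaults map[string]string
}

// Render replaces every {name} placeholder in a single pass. An argument
// given at the call site wins; otherwise the declared default is used.
func (t Town) Render(args map[string]string) string {
	pairs := make([]string, 0, 2*len(t.Params))
	for _, name := range t.Params {
		val, ok := args[name]
		if !ok {
			val = t.Defaults[name] // empty string if no default was declared
		}
		pairs = append(pairs, "{"+name+"}", val)
	}
	return strings.NewReplacer(pairs...).Replace(t.Template)
}

func main() {
	// Mirrors: page(@page=1, ext) = "https://news.ycombinator.com/news?p={@page}&ext={ext}"
	page := Town{
		Template: "https://news.ycombinator.com/news?p={@page}&ext={ext}",
		Params:   []string{"@page", "ext"},
		Defaults: map[string]string{"@page": "1"},
	}
	// Mirrors the call: page(ext="Hello World!")
	fmt.Println(page.Render(map[string]string{"ext": "Hello World!"}))
}
```

Because `@page` is omitted at the call site, the sketch falls back to its default of `1`, just as the call `page(ext="Hello World!")` does.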

Incidentally, `@page` is a parameter that is incremented automatically when the current page has no more content.
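
The auto-increment behaviour can be pictured as a simple loop. The Go sketch below is only an illustration: `collectPages` and the `fetch` callback are hypothetical, the fetch is faked with an in-memory map instead of real HTTP, and stopping at the first empty page is an assumption, since the spec does not say when crawling ends.

```go
package main

import "fmt"

// collectPages starts at the given page number and keeps moving to the
// next page once the current one is exhausted. It stops at the first page
// that yields no items (an assumed stopping rule).
func collectPages(start int, fetch func(page int) []string) []string {
	var all []string
	for page := start; ; page++ {
		items := fetch(page)
		if len(items) == 0 {
			return all
		}
		all = append(all, items...)
	}
}

func main() {
	// Stand-in for a real fetch: pages 1 and 2 have content, page 3 is empty.
	fake := map[int][]string{1: {"a", "b"}, 2: {"c"}}
	items := collectPages(1, func(page int) []string { return fake[page] })
	fmt.Println(items)
}
```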


### Node

Nodes form a tree that represents the structure of the data you are going to crawl.

```
news[]: page -> $("tr.athing")
    title: $(".title a.storylink").text
    site: $(".title span.sitestr").text
    link: $(".title a.storylink").href
```

As in YAML, the node hierarchy is expressed by indentation.
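
As an illustration of indentation-based nesting, the following Go sketch builds a tree from indented lines. The `Node` type and `parseTree` function are hypothetical and not taken from the project; the sketch counts each leading space or tab as one indent unit and makes every line a child of the nearest preceding line with a smaller indent.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical tree node: one script line plus its children.
type Node struct {
	Line     string
	Children []*Node
}

// parseTree attaches each non-blank line to the nearest preceding line
// whose indent is strictly smaller. The returned root is a synthetic
// container for the top-level nodes.
func parseTree(src string) *Node {
	type frame struct {
		indent int
		node   *Node
	}
	root := &Node{}
	stack := []frame{{-1, root}} // root sits below every real indent
	for _, raw := range strings.Split(src, "\n") {
		if strings.TrimSpace(raw) == "" {
			continue
		}
		indent := len(raw) - len(strings.TrimLeft(raw, " \t"))
		// Pop siblings and deeper lines until a real parent is on top.
		for stack[len(stack)-1].indent >= indent {
			stack = stack[:len(stack)-1]
		}
		n := &Node{Line: strings.TrimSpace(raw)}
		parent := stack[len(stack)-1].node
		parent.Children = append(parent.Children, n)
		stack = append(stack, frame{indent, n})
	}
	return root
}

func main() {
	src := "news[]: page -> $(\"tr.athing\")\n" +
		"    title: $(\".title a.storylink\").text\n" +
		"    site: $(\".title span.sitestr\").text\n" +
		"    link: $(\".title a.storylink\").href"
	for _, top := range parseTree(src).Children {
		fmt.Println(top.Line)
		for _, child := range top.Children {
			fmt.Println("  child:", child.Line)
		}
	}
}
```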

#### Node Name

Every node has a name. `title` is a field name and represents a plain string value. `news[]` is an array name and represents a parent structure with multiple child items.

#### Page

Page indicates where to fetch the field data from. It can be a town expression or a field reference.

A field reference is an advanced use of nodes; you can find the details in [./eh.crs](./eh.crs).

If a node has both a page and a fun, the page goes on the left of `->` and the fun on the right, that is, `page -> fun`.
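
The pieces of a node line (name, optional `[]` suffix, page, fun) can be separated with a few string operations. The Go sketch below is a simplification, not the project's parser: `NodeLine` and `parseNodeLine` are hypothetical names, a line without `->` is assumed to contain only a fun, and a `->` inside a quoted string is not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// NodeLine is a hypothetical parsed form of one node definition.
type NodeLine struct {
	Name    string
	IsArray bool   // true when the name ends in []
	Page    string // left of ->, empty if absent
	Fun     string // right of ->, or the whole body if there is no ->
}

// parseNodeLine splits "name: page -> fun" into its parts. It reports
// false when the line has no "name:" prefix.
func parseNodeLine(line string) (NodeLine, bool) {
	name, rest, ok := strings.Cut(line, ":")
	if !ok {
		return NodeLine{}, false
	}
	n := NodeLine{Name: strings.TrimSpace(name)}
	if strings.HasSuffix(n.Name, "[]") {
		n.IsArray = true
		n.Name = strings.TrimSuffix(n.Name, "[]")
	}
	if page, fun, found := strings.Cut(rest, "->"); found {
		n.Page = strings.TrimSpace(page)
		n.Fun = strings.TrimSpace(fun)
	} else {
		n.Fun = strings.TrimSpace(rest) // assumption: no arrow means fun only
	}
	return n, true
}

func main() {
	for _, line := range []string{
		`news[]: page -> $("tr.athing")`,
		`title: $(".title a.storylink").text`,
	} {
		n, ok := parseNodeLine(line)
		fmt.Printf("%+v %v\n", n, ok)
	}
}
```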

#### Fun

A fun describes how the fetched data is processed.

The supported funs are:

| Name | Parameters | Description |
| --------- | -------------------------------- | ---------------------------------------- |
| $ | (selector: string) | CSS selector |
| html | | inner HTML |
| text | | inner text |
| outerHTML | | outer HTML |
| attr | (attr: string) | attribute value |
| style | | style attribute value |
| href | | href attribute value |
| src | | src attribute value |
| calc | (prec: int) | calculate arithmetic expression |
| match | (regexp: string) | match first sub-string via regular expression |
| expand | (regexp: string, target: string) | expand matched strings to target string |
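
For readers unfamiliar with regexp-based extraction, here is a rough Go sketch of what `match` and `expand` might do, built on the standard `regexp` package. The exact semantics of the real funs are not specified above, so this is an assumption: `match` is taken to return the first matching substring, and `expand` to fill a target template (using Go's `$1` group syntax) from the first match. The sample inputs are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// match returns the first substring of s matched by pattern, or "" if
// nothing matches.
func match(s, pattern string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", err
	}
	return re.FindString(s), nil
}

// expand fills target from the first match of pattern in s. Capture
// groups are referenced as $1, $2, ... in target.
func expand(s, pattern, target string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", err
	}
	loc := re.FindStringSubmatchIndex(s)
	if loc == nil {
		return "", nil // no match: nothing to expand
	}
	return string(re.ExpandString(nil, target, s, loc)), nil
}

func main() {
	m, _ := match("score: 42 points", `\d+`)
	fmt.Println(m)

	e, _ := expand("item?id=12345", `id=(\d+)`, "item-$1")
	fmt.Println(e)
}
```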



## Author

Plutonist

> [impl.moe](impl.moe) · Github [@wspl](impl.moe)


## License

```
Copyright (c) 2017 Plutonist
All rights reserved.
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are
duplicated in all such forms and that any documentation,
advertising materials, and other materials related to such
distribution and use acknowledge that the software was developed
by the Plutonist. The name of the
Plutonist may not be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
```
File renamed without changes.
2 changes: 1 addition & 1 deletion main/main.go
@@ -5,7 +5,7 @@ import (
)

func main() {
-	//buf, _ := ioutil.ReadFile("./eh.crr")
+	//buf, _ := ioutil.ReadFile("./eh.crs")
//raw := string(buf)
//c := New(raw)
//c.Array("gallery").Each(func(c *Creeper) {