Commit f1ad9ad

galvez authored and AndyGauge committed
add unicode-segmentation example (#517)
Thanks!
1 parent 5824ee2 commit f1ad9ad

6 files changed, with 34 additions and 2 deletions

Cargo.toml

Lines changed: 3 additions & 2 deletions
@@ -8,6 +8,7 @@ publish = false
 build = "build.rs"

 [dependencies]
+ansi_term = "0.11.0"
 base64 = "0.9"
 bitflags = "1.0"
 byteorder = "1.0"
@@ -27,6 +28,7 @@ log = "0.4"
 log4rs = "0.8"
 memmap = "0.7"
 mime = "0.3"
+nalgebra = "0.16.12"
 ndarray = "0.12"
 num = "0.2"
 num_cpus = "1.8"
@@ -48,10 +50,9 @@ tar = "0.4.12"
 tempdir = "0.3.5"
 threadpool = "1.6"
 toml = "0.4"
+unicode-segmentation = "1.2.1"
 url = "1.6"
 walkdir = "2.0"
-ansi_term = "0.11.0"
-nalgebra = "0.16.12"

 [target.'cfg(target_os = "linux")'.dependencies]
 syslog = "4.0"

ci/dictionary.txt

Lines changed: 3 additions & 0 deletions
@@ -112,6 +112,8 @@ GitHub
 github
 GlobError
 Guybrush
+graphemes
+Graphemes
 GzDecoder
 GzEncoder
 Hackerman
@@ -311,6 +313,7 @@ Tuple
 typesafe
 unary
 unix
+unicode
 unwinded
 UpperHex
 uptime

src/links.md

Lines changed: 2 additions & 0 deletions
@@ -138,5 +138,7 @@ Keep lines sorted.
 [toml]: https://docs.rs/toml/
 [url-badge]: https://badge-cache.kominick.com/crates/v/url.svg?label=url
 [url]: https://docs.rs/url/
+[unicode-segmentation-badge]: https://badge-cache.kominick.com/crates/v/unicode-segmentation.svg?label=unicode-segmentation
+[unicode-segmentation]: https://docs.rs/unicode-segmentation/
 [walkdir-badge]: https://badge-cache.kominick.com/crates/v/walkdir.svg?label=walkdir
 [walkdir]: https://docs.rs/walkdir/

src/text.md

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,7 @@

 | Recipe | Crates | Categories |
 |--------|--------|------------|
+| [Collect Unicode Graphemes][ex-unicode-graphemes] | [![unicode-segmentation-badge]][unicode-segmentation] | [![cat-encoding-badge]][cat-text-processing] |
 | [Verify and extract login from an email address][ex-verify-extract-email] | [![regex-badge]][regex] [![lazy_static-badge]][lazy_static] | [![cat-text-processing-badge]][cat-text-processing] |
 | [Extract a list of unique #Hashtags from a text][ex-extract-hashtags] | [![regex-badge]][regex] [![lazy_static-badge]][lazy_static] | [![cat-text-processing-badge]][cat-text-processing] |
 | [Extract phone numbers from text][ex-phone] | [![regex-badge]][regex] | [![cat-text-processing-badge]][cat-text-processing] |
@@ -15,6 +16,7 @@
 [ex-regex-filter-log]: text/regex.html#filter-a-log-file-by-matching-multiple-regular-expressions
 [ex-regex-replace-named]: text/regex.html#replace-all-occurrences-of-one-text-pattern-with-another-pattern

+[ex-unicode-graphemes]: text/string_parsing.html#collect-unicode-graphemes
 [string_parsing-from_str]: text/string_parsing.html#implement-the-fromstr-trait-for-a-custom-struct

 {{#include links.md}}

src/text/string_parsing.md

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 # String Parsing

+{{#include string_parsing/graphemes.md}}
+
 {{#include string_parsing/from_str.md}}

 {{#include ../links.md}}

src/text/string_parsing/graphemes.md

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+## Collect Unicode Graphemes
+
+[![unicode-segmentation-badge]][`unicode-segmentation`] [![cat-text-processing-badge]][cat-text-processing]
+
+Collect individual Unicode graphemes from UTF-8 string using the
+[`UnicodeSegmentation::graphemes`] function from the [`unicode-segmentation`] crate.
+
+```rust
+#[macro_use]
+extern crate unicode_segmentation;
+use unicode_segmentation::UnicodeSegmentation;
+
+fn main() {
+    let name = "José Guimarães\r\n";
+    let graphemes = UnicodeSegmentation::graphemes(name, true)
+        .collect::<Vec<&str>>();
+    assert_eq!(graphemes[3], "é");
+}
+```
+
+[`UnicodeSegmentation::graphemes`]: https://docs.rs/unicode-segmentation/*/unicode_segmentation/trait.UnicodeSegmentation.html#tymethod.graphemes
+[`unicode-segmentation`]: https://docs.rs/unicode-segmentation/1.2.1/unicode_segmentation/
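
Note on the recipe: it collects graphemes rather than `char`s because one user-perceived character can span several Unicode scalar values. The std-only sketch below is not part of the commit, and the helper name `scalar_count` is hypothetical. It only illustrates the mismatch that grapheme segmentation is meant to resolve: a decomposed "é" counts as two `char`s.

```rust
/// Hypothetical helper (not part of the commit): counts Unicode scalar
/// values (`char`s), which is not the same as counting graphemes.
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    // Precomposed form: "é" is the single code point U+00E9.
    let precomposed = "Jos\u{e9}";
    // Decomposed form: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
    let decomposed = "Jose\u{301}";

    // Both strings render as "José", but their `char` counts differ.
    assert_eq!(scalar_count(precomposed), 4);
    assert_eq!(scalar_count(decomposed), 5);

    // "\r\n" is two `char`s as well.
    assert_eq!(scalar_count("\r\n"), 2);

    println!(
        "precomposed: {} chars, decomposed: {} chars",
        scalar_count(precomposed),
        scalar_count(decomposed)
    );
}
```

With `UnicodeSegmentation::graphemes(s, true)`, as used in the recipe, both spellings should segment into four extended grapheme clusters, and `"\r\n"` into a single one, following the cluster rules of Unicode Standard Annex #29.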
