Commit e36ac82
Snapshot 1.0.2 (apache#148)
* GitBook: [apache#2] No subject
* GitBook: [apache#3] No subject
* Add UDAF support to the SDK
1 parent d365063 commit e36ac82

File tree

83 files changed

+776
-1327
lines changed


README-chinese.md

Lines changed: 0 additions & 150 deletions
This file was deleted.

README.md

Lines changed: 5 additions & 141 deletions
@@ -1,142 +1,6 @@
# RocketMQ Streams [![Build Status](https://app.travis-ci.com/apache/rocketmq-streams.svg?branch=main)](https://app.travis-ci.com/apache/rocketmq-streams) [![CodeCov](https://codecov.io/gh/apache/rocketmq-stream/branch/main/graph/badge.svg)](https://app.codecov.io/gh/apache/rocketmq-streams)

[![GitHub release](https://img.shields.io/badge/release-download-orange.svg)](https://github.com/apache/rocketmq-streams/releases)
[![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)
[![Average time to resolve an issue](http://isitmaintained.com/badge/resolution/apache/rocketmq-streams.svg)](http://isitmaintained.com/project/apache/rocketmq-streams "Average time to resolve an issue")
[![Percentage of issues still open](http://isitmaintained.com/badge/open/apache/rocketmq-streams.svg)](http://isitmaintained.com/project/apache/rocketmq-streams "Percentage of issues still open")
[![Twitter Follow](https://img.shields.io/twitter/follow/ApacheRocketMQ?style=social)](https://twitter.com/intent/follow?screen_name=ApacheRocketMQ)

## [Chinese Documentation](./README-Chinese.md)

## [Quick Start](./quick_start.md)

## Features

* Lightweight deployment: RocketMQ Streams can be deployed standalone or in cluster mode.
* Various types of data input and output: the source supports RocketMQ, while sinks support databases, RocketMQ, and more.

## DataStream Example

```java
import org.apache.rocketmq.streams.client.transform.DataStream;

DataStreamSource source = StreamBuilder.dataStream("namespace", "pipeline");
source
    .fromFile("~/admin/data/text.txt", false)
    .map(message -> message)
    .toPrint(1)
    .start();
```

## Maven Repository

```xml
<dependency>
    <groupId>org.apache.rocketmq</groupId>
    <artifactId>rocketmq-streams-clients</artifactId>
    <version>1.0.2-SNAPSHOT</version>
</dependency>
```

# Core API

RocketMQ Streams implements a series of high-level APIs that let users write stream computing programs conveniently and meet their own business requirements.

## StreamBuilder

StreamBuilder is used to build the source of a stream task. It contains two methods, ```dataStream()``` and ```tableStream()```, which return the two source types DataStreamSource and TableStreamSource, respectively.

+ ```dataStream(nameSpaceName, pipelineName)``` returns an instance of DataStreamSource, used for operator-by-operator programming of stream computing tasks.
+ ```tableStream(nameSpaceName, pipelineName)``` returns an instance of TableStreamSource, used for script-style programming of stream computing tasks.
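As an illustrative sketch of the two entry points (the namespace and pipeline names are placeholders; imports are omitted because they follow the example above):

```java
// Sketch only: placeholder namespace/pipeline names.
// dataStream(): build a source for operator-by-operator programming.
DataStreamSource dataSource = StreamBuilder.dataStream("example_namespace", "example_pipeline");

// tableStream(): build a source for script-style programming.
TableStreamSource tableSource = StreamBuilder.tableStream("example_namespace", "example_table_pipeline");
```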
## DataStream API

### Source

DataStreamSource is the source class for operator-by-operator programming. It interfaces with various data sources and obtains data from the major message queues.

+ ```fromFile```: reads data from a file. This method takes the following parameters:
  + ```filePath```: the path of the file to read. Required.
  + ```isJsonData```: specifies whether the data is in JSON format. Optional. Default value: ```true```.
  + ```tags```: the tags used by the RocketMQ consumer to filter messages. Optional.

+ ```fromRocketmq```: obtains data from RocketMQ. It takes four parameters:
  + ```topic```: the RocketMQ topic name. Required.
  + ```groupName```: the consumer group name. Required.
  + ```isJson```: specifies whether the data is in JSON format. Optional.
  + ```tags```: the tags used by the RocketMQ consumer to filter messages. Optional.

+ ```fromKafka```: reads data from Kafka. It takes five parameters:
  + ```bootstrapserver```: the Kafka bootstrap servers. Required.
  + ```topic```: the Kafka topic name. Required.
  + ```groupName```: the consumer group name. Required.
  + ```isJson```: specifies whether the data is in JSON format. Optional.
  + ```maxThread```: the maximum number of Kafka consumer threads. Optional.

+ ```fromMqtt```: reads data from an MQTT service. It takes nine parameters:
  + ```url```: the broker URL of the MQTT service. Required.
  + ```clientId```: the client ID. Required.
  + ```topic```: the MQTT topic name. Required.
  + ```username```: the username. Optional.
  + ```password```: the password. Optional.
  + ```cleanSession```: specifies whether to clear the session on restart. Optional.
  + ```connectionTimeout```: the connection timeout. Optional.
  + ```aliveInterval```: the keep-alive interval. Optional.
  + ```automaticReconnect```: specifies whether to reconnect automatically. Optional.

+ ```from```: a custom data source. You can specify your own data source by implementing the ISource interface.
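A minimal sketch of wiring a RocketMQ source with the parameters above (the topic and group names are placeholders, and the exact ```fromRocketmq``` overload shown is an assumption for illustration):

```java
// Sketch only: assumes a fromRocketmq(topic, groupName, isJson) overload
// matching the documented parameters.
StreamBuilder.dataStream("example_namespace", "rocketmq_pipeline")
    .fromRocketmq("example_topic", "example_group", true) // isJson = true
    .toPrint(1)
    .start();
```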
### Transform

Transform operators modify the input data during stream computation and pass it to the next step. The DataStream API includes ```DataStream```, ```JoinStream```, ```SplitStream```, ```WindowStream```, and many other transform classes.

#### DataStream

DataStream implements a series of common stream computing operators, as follows:

+ ```map```: returns a new DataStream by passing each record of the source to the **func** function.
+ ```flatmap```: similar to map, but one input item corresponds to 0 or more output items.
+ ```filter```: returns a new DataStream containing only the records of the source DataStream for which the **func** function returns **true**.
+ ```forEach```: executes the **func** function once for each record and returns a new DataStream.
+ ```selectFields```: returns the specified field values for each record and returns a new DataStream.
+ ```operate```: executes a custom function for each record and returns a new DataStream.
+ ```script```: executes a script against the fields of each record, returning new fields and generating a new DataStream.
+ ```toPrint```: prints the result to the console and generates a new DataStreamAction instance.
+ ```toFile```: saves the result to a file and generates a new DataStreamAction instance.
+ ```toDB```: saves the result to a database.
+ ```toRocketmq```: outputs the result to RocketMQ.
+ ```toKafka```: outputs the result to Kafka.
+ ```to```: outputs the result to a specified storage through a custom ISink implementation.
+ ```window```: performs statistical analysis within a window, generally used together with ```groupBy```. ```window()``` defines the size of the window, and ```groupBy()``` defines the keys of the analysis; multiple keys can be specified:
  + ```count```: counts records in the window.
  + ```min```: gets the minimum of a statistic in the window.
  + ```max```: gets the maximum of a statistic in the window.
  + ```avg```: gets the average of a statistic in the window.
  + ```sum```: gets the sum of a statistic in the window.
  + ```reduce```: performs a custom aggregate calculation in the window.
+ ```join```: joins two streams, or a stream and a physical table, according to the given conditions, merging them into one stream for further computation.
+ ```leftJoin```: like ```join```, but all records of the left stream are retained.
+ ```dimJoin```: joins a stream with a physical table (a file or a database table); only matching records are retained.
+ ```dimLeftJoin```: joins a stream with a physical table; all records of the stream are retained, and fields with no match in the table are left empty.
+ ```union```: merges two streams.
+ ```split```: splits a data stream into different streams according to tags, for separate downstream analysis and computation.
+ ```with```: specifies strategies to apply during computation, including Checkpoint and state-storage strategies, etc.
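Putting several of these operators together, a windowed count might be sketched as follows (the ```window```, ```groupBy```, and ```count``` signatures here are illustrative assumptions; the field and file names are placeholders):

```java
// Sketch only: window/groupBy/count argument shapes are assumptions,
// not the library's confirmed signatures.
StreamBuilder.dataStream("example_namespace", "window_pipeline")
    .fromFile("~/admin/data/text.txt", true)   // JSON input
    .filter(message -> message != null)        // drop empty records
    .window(/* e.g. a 1-minute tumbling window */)
    .groupBy("example_key_field")              // key for the aggregation
    .count("total")                            // count per key per window
    .toPrint(1)
    .start();
```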
# Strategy

The Strategy mechanism is mainly used to control the underlying behavior of the computing engine, such as how Checkpoints and state are stored; controls for windows, dual-stream joins, and so on will be added later. All control strategies are passed through the ```with``` operator, and multiple strategy types can be passed at the same time.

```java
// Specify the storage strategy for Checkpoints.
source
    .fromRocketmq("TSG_META_INFO", "")
    .map(message -> message + "--")
    .toPrint(1)
    .with(CheckpointStrategy.db("jdbc:mysql://XXXXX:3306/XXXXX", "", "", 0L))
    .start();
```

# Summary

* [Quick Start](quick_start.md)
* [Create a real-time task data source](stream_source.md)
* [Create a real-time task data sink](stream_sink.md)
* [Data processing logic](stream_transform.md)

SUMMARY.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@

# Table of contents

* [Summary](README.md)
* [Quick Start](quick_start.md)
* [stream_source](stream_source.md)
* [stream_sink](stream_sink.md)
* [Data processing logic](stream_transform.md)
