This is an example Swift stream-processing application that shows how to count words with the Apache Spark Connect Swift Client library.
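
The walkthrough relies on a few command-line tools: `docker` and `nc` for the container-based steps, and `swift` only when running from source. As a minimal sketch (the helper name `missing_tools` is hypothetical and not part of this example), the following lists whichever of them are not on `PATH`:

```shell
#!/bin/sh
# Hypothetical helper, not part of the example application:
# print the name of every given command that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumption: these are the tools used in this walkthrough.
missing=$(missing_tools docker nc swift)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All tools found"
fi
```

If `swift` is reported missing, the Docker-based steps can still be followed.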

Start the Spark Connect server.

docker run --rm -p 15002:15002 apache/spark:4.1.1 bash -c "/opt/spark/sbin/start-connect-server.sh --wait -c spark.log.level=ERROR"

You will first need to run Netcat (a small utility found in most Unix-like systems) as a data server by using the following command.

nc -lk 9999

Build an application Docker image.

$ docker build -t apache/spark-connect-swift:stream .
$ docker images apache/spark-connect-swift:stream
IMAGE                               ID             DISK USAGE   CONTENT SIZE   EXTRA
apache/spark-connect-swift:stream   683d4bd67cec   550MB        128MB

Run the stream Docker image.

docker run --rm -e SPARK_REMOTE=sc://host.docker.internal:15002 -e TARGET_HOST=host.docker.internal apache/spark-connect-swift:stream

Then, any lines typed in the terminal running the Netcat server will be counted and printed on screen every second.

$ nc -lk 9999
apache spark
apache hadoop

The Spark Connect Server output will look something like the following.

-------------------------------------------
Batch: 0
-------------------------------------------
+----+--------+
|word|count(1)|
+----+--------+
+----+--------+
-------------------------------------------
Batch: 1
-------------------------------------------
+------+--------+
|  word|count(1)|
+------+--------+
|apache|       1|
| spark|       1|
+------+--------+
-------------------------------------------
Batch: 2
-------------------------------------------
+------+--------+
|  word|count(1)|
+------+--------+
|apache|       2|
| spark|       1|
|hadoop|       1|
+------+--------+

Alternatively, run the application from source.

$ TARGET_HOST=host.docker.internal swift run
...
Connected to Apache Spark 4.1.1 Server
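
The running totals shown in the batch output can be checked offline with standard Unix tools. The following is a minimal sketch under stated assumptions: `word_counts` is a hypothetical helper name, not part of the example application, and the pipeline only mimics the word/count table without talking to Spark. Row order may differ from the console output, because the sketch sorts by count and then alphabetically.

```shell
#!/bin/sh
# Hypothetical helper, not part of the example application:
# count whitespace-separated words read from stdin and print "word count"
# pairs, highest count first.
word_counts() {
  tr -s ' \t' '\n\n' |        # put each word on its own line
    grep -v '^$' |             # drop empty lines
    sort |                     # group identical words together
    uniq -c |                  # prefix each word with its number of occurrences
    awk '{ print $2, $1 }' |   # reorder to "word count"
    sort -k2,2nr -k1,1         # highest count first, then alphabetical
}

# Input received by the time Batch 1 is printed
printf 'apache spark\n' | word_counts

# Input received by the time Batch 2 is printed; totals include the earlier line
printf 'apache spark\napache hadoop\n' | word_counts
```

Feeding both typed lines through the helper yields `apache 2`, `hadoop 1`, and `spark 1`, the same totals as Batch 2.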