> Spark Structured Streaming
Real-time stream processing from Kafka, files, and sockets with Spark.
fetch
$ curl "https://skillshub.wtf/skillshub-team/catalog-batch5/spark-streaming?format=md"
Spark Streaming
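from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("events-stream").getOrCreate()
# Assumed event schema: adjust the fields and types to match your JSON payload
schema = StructType([StructField("userId", StringType()), StructField("ts", TimestampType())])

# Read the "events" topic from Kafka (needs the spark-sql-kafka connector package)
df = spark.readStream.format("kafka") \
    .option("kafka.bootstrap.servers", "localhost:9092") \
    .option("subscribe", "events").load()
# Kafka delivers value as bytes: cast it to a string, then parse the JSON
events = df.selectExpr("CAST(value AS STRING)") \
    .select(F.from_json(F.col("value"), schema).alias("d")).select("d.*")
# Windowed aggregation: per-user counts in 5-minute windows, tolerating 10 minutes of lateness
windowed = events.withWatermark("ts", "10 minutes") \
    .groupBy(F.window("ts", "5 minutes"), "userId") \
    .agg(F.count("*").alias("cnt"))
query = windowed.writeStream.outputMode("update") \
    .format("console").start()
query.awaitTermination()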
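To make the windowing and watermark behaviour concrete, here is a minimal pure-Python sketch of a 5-minute tumbling window with a 10-minute watermark. It is an illustration, not Spark's implementation: the names `window_start` and `count_with_watermark` are hypothetical, the sample events are made up, and the watermark here advances after every event, whereas Spark advances it once per micro-batch.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)   # tumbling window length
DELAY = timedelta(minutes=10)   # watermark delay (allowed lateness)
EPOCH = datetime(1970, 1, 1)

def window_start(ts):
    """Start of the tumbling window containing ts (windows aligned to the epoch)."""
    return ts - (ts - EPOCH) % WINDOW

def count_with_watermark(events):
    """Count events per (window start, user), dropping events that arrive too late.

    Simplification (assumption): the watermark is recomputed after every event.
    """
    counts = {}
    max_ts = None
    for user, ts in events:
        start = window_start(ts)
        if max_ts is not None and start + WINDOW <= max_ts - DELAY:
            continue  # the event's window ended at or before the watermark: drop it
        counts[(start, user)] = counts.get((start, user), 0) + 1
        max_ts = ts if max_ts is None else max(max_ts, ts)
    return counts

def at(hour, minute):
    """Helper for building sample timestamps (hypothetical data)."""
    return datetime(2024, 1, 1, hour, minute)

demo = count_with_watermark([
    ("a", at(12, 1)),
    ("a", at(12, 3)),
    ("b", at(12, 20)),   # pushes the watermark to 12:10
    ("a", at(12, 2)),    # window 12:00-12:05 is behind the watermark: dropped
    ("a", at(12, 12)),   # window 12:10-12:15 is still open: counted
])
```

With these sample events, `demo` holds a count of 2 for user `a` in the 12:00 window, 1 for user `b` in the 12:20 window, and 1 for user `a` in the 12:10 window; the late 12:02 event is ignored.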
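Output modes: append (emit only finalized rows), update (emit only rows changed since the last trigger), complete (emit the full result table on every trigger)
Triggers: processingTime (micro-batch at a fixed interval), once (run a single batch, then stop), availableNow (process all available data in one or more batches, then stop)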
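As a rough sketch of how these options are passed, the helpers below map each trigger name to the keyword argument accepted by `DataStreamWriter.trigger()` and chain it with an output mode. The names `trigger_kwargs` and `start_console_query` and the default 30-second interval are assumptions for illustration; `availableNow` needs Spark 3.3 or later.

```python
def trigger_kwargs(kind, interval="30 seconds"):
    """Map a trigger name to keyword arguments for DataStreamWriter.trigger()."""
    if kind == "processingTime":
        return {"processingTime": interval}  # start a micro-batch every `interval`
    if kind == "once":
        return {"once": True}                # run exactly one batch, then stop
    if kind == "availableNow":
        return {"availableNow": True}        # drain available data, then stop
    raise ValueError(f"unknown trigger: {kind}")

def start_console_query(df, output_mode="update", trigger="processingTime"):
    """Start a console sink on a streaming DataFrame with the given mode and trigger.

    `df` is expected to be a streaming DataFrame, such as the aggregated one
    built from the Kafka source.
    """
    return (df.writeStream
              .outputMode(output_mode)
              .trigger(**trigger_kwargs(trigger))
              .format("console")
              .start())
```

For example, `start_console_query(windowed, "update", "availableNow")` would process whatever is currently in the topic and then stop, which is handy for scheduled or test runs. Note that append mode on an aggregation only emits a window once the watermark has passed its end.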
┌ stats
installs/wk: 0
first seen: Mar 18, 2026
└────────────