Spark-Redis connection

Hi,

I’m trying to connect Redis with Spark. I’d like to read a Redis HASH db into a Spark dataframe. I think I’m close to getting it to work, but there don’t seem to be many code samples out there written for Python. After much effort, the code finally runs without error; however, it returns no results.

My data is located in Redis db 0. My code is as follows:

from pyspark.sql import SparkSession
from pyspark.sql.types import StructField, StructType, StringType

spark = SparkSession.builder.appName("myApp")\
          .config('repositories', 'app/spark-apps/spark-redis-2.3.1-M2-jar-with-dependencies.jar')\
          .config("spark.redis.host", "redis-cache")\
          .config("spark.redis.port", "6379")\
          .config("spark.redis.dbNum", "0")\
          .getOrCreate()

spark.conf.set("master", "spark://spark-master:7077")
sc = spark.sparkContext
sc.addPyFile("spark-apps/appx_wordcount_redis.zip")
sc.setLogLevel("ERROR")

data_schema = [StructField('id', StringType(), True), StructField('text', StringType(), True)]
final_struc = StructType(fields=data_schema)

df = spark.read.format("org.apache.spark.sql.redis")\
          .schema(final_struc)\
          .option("table", "0")\
          .option("key.column", "id").load()

df.show(10)
df.printSchema()

+---+----+
| id|text|
+---+----+
+---+----+

root
|-- id: string (nullable = true)
|-- text: string (nullable = true)

My Redis db0 HASH looks like this:

127.0.0.1:6379> HGETALL rec-25

  1) "text"
  2) ""I PURCHASED A SET OF BRAKE PADS FROM ""CARQUEST AUTO PARTS"" IN BETHLEHEM"
  3) "id"
  4) "732996"
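
One possibility I have not ruled out is a key-naming mismatch. As far as I understand the spark-redis documentation, a read with `.option("table", "0")` looks for keys of the form `0:<key>`, while my hash is stored under `rec-25`. The sketch below only illustrates that matching logic in plain Python; `fnmatchcase` stands in for Redis glob matching, and `table_key_pattern`, `matching_keys` and `read_hashes` are hypothetical helper names. The `keys.pattern` option is taken from the spark-redis dataframe docs, and I have not verified that the 2.3.1-M2 build supports it.

```python
from fnmatch import fnmatchcase


def table_key_pattern(table: str) -> str:
    """Key pattern assumed to be scanned for a given "table" option."""
    # Assumption: spark-redis stores and reads rows under "<table>:<key>".
    return f"{table}:*"


def matching_keys(keys, pattern):
    """Return the keys that match a glob-style pattern."""
    return [k for k in keys if fnmatchcase(k, pattern)]


def read_hashes(spark, schema, pattern="rec-*"):
    """Hypothetical alternative read that uses a key pattern instead of a table.

    Not called here; it needs a live SparkSession with spark-redis on the classpath.
    """
    return (spark.read.format("org.apache.spark.sql.redis")
            .schema(schema)
            .option("keys.pattern", pattern)  # option name per the spark-redis docs
            .load())


existing_keys = ["rec-25"]  # the key shown by HGETALL above

default_matches = matching_keys(existing_keys, table_key_pattern("0"))
custom_matches = matching_keys(existing_keys, "rec-*")

print(default_matches)  # keys a table-based read would be expected to find
print(custom_matches)   # keys a "rec-*" pattern would be expected to find
```

If that assumption holds, the table-based read finds nothing because no key starts with `0:`, whereas a pattern such as `rec-*` would cover the existing hashes.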

Any ideas why no results?

Thanks!