
Load a file into Couchbase using Spark

From: Tajikstan View: 4503 Rafa 

Question

I haven't found a clear solution for loading a file into Couchbase using Spark.

I have a huge file with a lot of records similar to this:

ID|Content
prd_lct:11118:3|{"type":"prd_lct","lct_nbr":118,"itm_nbr":3,"locations":[{"lct_s12_id":1,"prd_121_typ_cd":1,"fxt_ail_id":"45","fxt_bay_id":"121","lvl_txt":"2"}],"itemDetails":[{"pmy_vbu_nbr":null,"upc_id":"1212121","vnd_mod_id":"1212121"}]}

My code

spark-shell --packages com.couchbase.client:spark-connector_2.11:2.2.0 --conf spark.couchbase.username=username --conf spark.couchbase.password=password --conf spark.couchbase.bucket.bucketname="" --conf spark.couchbase.nodes=http://1.2.3.4:18091,http://1.2.3.3:18091,http://1.2.3.5:18091

import com.couchbase.client.java.document.JsonDocument
import com.couchbase.client.java.document.json.JsonObject
import com.couchbase.spark._
import com.couchbase.spark.streaming._
import org.apache.spark.sql.{DataFrameReader, SQLContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.{SparkConf, SparkContext}
val df = spark.read.option("delimiter", "|").option("header", true).csv("/hdfsData/test.doc").toDF()
df.createOrReplaceTempView("TempQuery")
spark.sql("select * from TempQuery").map(pair => { val ID = JsonArray.create()
    val content = JsonObject.create().put("ID", ID)
    pair._2.map(_.value.getString("Content")).foreach(ID.add)
    JsonDocument.create(pair._1, content)
  })
  .saveToCouchbase()

I know this is wrong, but I have just started and am new to Scala and Couchbase.

Please let me know your thoughts. Basically, I have the key and value in a file separated by |, and I want to load them into Couchbase.
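
For reference, the key/value split described above can be sketched in plain Scala. This is only a minimal sketch, not a tested loader: `splitRecord` is a hypothetical helper name, and it assumes each line holds exactly one record in the form `ID|Content`, with the header line `ID|Content` present once. It splits on the first `|` only, so any `|` inside the JSON content is kept intact. The commented lines at the end show how it might plug into the connector from the question; that part is an untested assumption based on the connector 2.x `saveToCouchbase()` call already used above.

```scala
// Split one "ID|Content" line on the first '|' only, so that any '|'
// characters inside the JSON content are left untouched.
// Returns None for the header line, blank lines, lines without a '|',
// and lines whose ID part is empty.
def splitRecord(line: String): Option[(String, String)] = {
  val idx = line.indexOf('|')
  if (idx <= 0 || line == "ID|Content") None
  else Some((line.substring(0, idx), line.substring(idx + 1)))
}

// Shortened sample record in the same shape as the one in the question.
val sample = """prd_lct:11118:3|{"type":"prd_lct","lct_nbr":118,"itm_nbr":3}"""
val parsed = splitRecord(sample)

// Hypothetical use in spark-shell with the connector loaded (untested assumption):
//   import com.couchbase.client.java.document.RawJsonDocument
//   import com.couchbase.spark._
//   spark.sparkContext.textFile("/hdfsData/test.doc")
//     .flatMap(splitRecord(_))
//     .map { case (id, json) => RawJsonDocument.create(id, json) }
//     .saveToCouchbase()
```

Reading the file as plain text instead of through the CSV reader avoids having the CSV parser interpret the quotes inside the JSON, and the ID becomes the document key while the content is stored unchanged.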

Best answer