Spark Cast String Type to Integer Type (int)

In Spark SQL, to convert (cast) a string column to integer type (int), use the cast() function of the Column class. It can be used with withColumn(), select(), and selectExpr(), and the same conversion can also be written as a SQL expression. cast() takes either a string that names the target type (for example "int") or a DataType object such as IntegerType.

1. Using select() Example

import org.apache.spark.sql.functions.col

// Using select(); df is the DataFrame created in section 2
df.select(col("salary").cast("int").as("salary")).printSchema()

// Using selectExpr()
df.selectExpr("cast(salary as int) salary").printSchema()

2. Setup a DataFrame

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
      .master("local[1]")
      .appName("SparkByExamples.com")
      .getOrCreate()

val simpleData = Seq(("James",34,"true","M","3000.6089"),
         ("Michael",33,"true","F","3300.8067"),
         ("Robert",37,"false","M","5000.5034")
     )

import spark.implicits._
val df = simpleData.toDF("firstname","age","isGraduated","gender","salary")
df.printSchema()

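This outputs the schema below. Note that the salary column is a string type.

root
 |-- firstname: string (nullable = true)
 |-- age: integer (nullable = false)
 |-- isGraduated: string (nullable = true)
 |-- gender: string (nullable = true)
 |-- salary: string (nullable = true)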
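If you would rather confirm the column type in code than read it off the printed schema, the field can be looked up by name. The following is a minimal sketch written as spark-shell style statements; the app name and the one-row sample are placeholders for illustration, not part of the original setup.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType}

val spark = SparkSession.builder
  .master("local[1]")
  .appName("SchemaCheck") // placeholder app name
  .getOrCreate()
import spark.implicits._

// One row with the same column layout as the sample data (illustrative only)
val df = Seq(("James", 34, "true", "M", "3000.6089"))
  .toDF("firstname", "age", "isGraduated", "gender", "salary")

// StructType lets you look up a field by name and inspect its data type
val salaryType = df.schema("salary").dataType
val salaryIsString = salaryType == StringType

println(s"salary is a string column: $salaryIsString")
```

A check like this can be handy in a job that should fail early when an input column does not have the expected type.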

3. Using Spark SQL – Cast String to Integer Type

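Inside a SQL query string the Column cast() method is not available; instead, Spark SQL provides data type functions for casting. Below, INT(string column name) is used to convert the column to integer type. The standard CAST(column AS INT) syntax works as well.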

df.createOrReplaceTempView("CastExample")
val df4 = spark.sql("SELECT firstname,age,isGraduated,INT(salary) as salary from CastExample")
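As a rough illustration, the sketch below runs the INT() function and the CAST(... AS INT) syntax side by side against a temporary view and prints both schemas. The sample rows use whole-number strings and the names viaInt and viaCast are made up for this example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.IntegerType

val spark = SparkSession.builder
  .master("local[1]")
  .appName("SqlCastSketch") // placeholder app name
  .getOrCreate()
import spark.implicits._

// Illustrative data: salaries stored as whole-number strings
val df = Seq(("James", "3000"), ("Michael", "3300")).toDF("firstname", "salary")
df.createOrReplaceTempView("CastExample")

// Data type function form
val viaInt = spark.sql("SELECT firstname, INT(salary) AS salary FROM CastExample")
// Standard SQL CAST form
val viaCast = spark.sql("SELECT firstname, CAST(salary AS INT) AS salary FROM CastExample")

// Both schemas should show salary as integer
viaInt.printSchema()
viaCast.printSchema()
```

Which form to use is mostly a matter of taste; CAST is the more portable spelling if the query may later run on another SQL engine.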

4. withColumn() – Cast String to Integer Type

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.IntegerType

// Convert String to Integer Type
val df2= df.withColumn("salary",col("salary").cast(IntegerType))
df2.printSchema()
df2.show()
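One thing worth checking after a cast is whether every string was actually a valid integer. The sketch below assumes Spark 3.0 or later, where the spark.sql.ansi.enabled setting exists, and turns ANSI mode off so that a failed cast produces null instead of raising an error. The sample values and the salary_int column name are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder
  .master("local[1]")
  .appName("CastNullCheck") // placeholder app name
  .getOrCreate()
import spark.implicits._

// Assumption: with ANSI mode off, an unparsable string casts to null
spark.conf.set("spark.sql.ansi.enabled", "false")

// Illustrative data: one valid integer string and one invalid value
val raw = Seq("42", "abc").toDF("salary")
val casted = raw.withColumn("salary_int", col("salary").cast("int"))

// Rows whose source was present but whose cast result is null did not parse
val failed = casted.filter(col("salary").isNotNull && col("salary_int").isNull)
failed.show()
```

With ANSI mode on, the same cast is expected to fail with an error instead, so it is worth knowing which setting your cluster uses before relying on the null check.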

 


Alternatively, you can pass the target type name as a string instead of a DataType object:

df.withColumn("salary",col("salary").cast("int"))
df.withC

In this short Spark article, we covered how to convert a DataFrame column from string type to integer type using the cast() function with withColumn(), select(), and selectExpr(), and finally with a Spark SQL query on a temporary view.

Hope you like this blog….
Mahesh Wabale