Streamlined Data Access: Spark’s integration with YARN, HDFS, Hive, HBase and ORC



Data access engines are increasingly indispensable in a connected world where data has become a critical asset, with high stakes riding on it across every industry and vertical.


Hortonworks is improving Spark’s integration with YARN, HDFS, Hive, HBase and ORC because many users run Spark on YARN in conjunction with other popular data access engines.

Hortonworks is focusing on data access via the new Data Source API. The goal is to allow Spark SQL users to take full advantage of these capabilities:
  • ORC File instantiation as a table
  • Column pruning
  • Language integrated queries
  • Predicate pushdown
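
To make the last three ideas more concrete, here is a minimal conceptual sketch in plain Python. It is a toy model, not Spark's or ORC's actual implementation: the `Stripe` class, the `scan` function and the sample data are all hypothetical names invented for illustration. The sketch assumes a columnar layout in which each stripe can report min/max statistics per column, which is the kind of metadata that lets a reader skip data that cannot match a filter (predicate pushdown) and read only the columns a query asks for (column pruning).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Stripe:
    """Hypothetical stand-in for a columnar stripe: values are stored per column."""
    columns: Dict[str, List[int]]

    def stats(self, column: str) -> Tuple[int, int]:
        # A real columnar file would store these statistics ahead of time;
        # this toy version computes them on demand.
        values = self.columns[column]
        return min(values), max(values)


def scan(stripes: List[Stripe], wanted: List[str],
         filter_column: str, lower_bound: int) -> Tuple[List[tuple], int]:
    """Return rows of the `wanted` columns where filter_column > lower_bound,
    plus the number of stripes that actually had to be read."""
    rows = []
    stripes_read = 0
    for stripe in stripes:
        _, col_max = stripe.stats(filter_column)
        if col_max <= lower_bound:
            # Predicate pushdown: no value in this stripe can match, so skip it.
            continue
        stripes_read += 1
        # Column pruning: only the requested columns are materialized.
        selected = [stripe.columns[name] for name in wanted]
        for i, value in enumerate(stripe.columns[filter_column]):
            if value > lower_bound:
                rows.append(tuple(col[i] for col in selected))
    return rows, stripes_read


# Hypothetical sample data: two stripes with three columns each.
stripes = [
    Stripe({"id": [1, 2, 3], "age": [21, 25, 29], "score": [70, 80, 90]}),
    Stripe({"id": [4, 5, 6], "age": [31, 45, 28], "score": [60, 75, 85]}),
]

# Roughly "SELECT id FROM people WHERE age > 30" in this toy model.
rows, stripes_read = scan(stripes, wanted=["id"], filter_column="age", lower_bound=30)
print(rows, stripes_read)  # prints [(4,), (5,)] 1
```

In this sketch the first stripe is never read because its largest `age` is 29, and the `score` column is never touched at all. A real engine makes the same kind of decisions inside the data source rather than in user code.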
