Highlights from Apache Big Data Conference in Budapest by Louis Rabiet, software engineer at Squid
We are only at the beginning of the Hadoop revolution. Many of the features and possibilities offered by large-scale solutions such as Apache Spark are still unused by many companies.
Many of the talks presented new features:
- Apache Drill, which can query unstructured data (JSON) and discover its schema out of the box.
- New features in Hive for upserts and for smart caching of queries (LLAP).
- Apache Kylin, which provides advanced tooling for working with OLAP cubes.
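To make the idea of schema discovery on JSON more concrete, here is a minimal sketch in plain Python. It is not how Apache Drill works internally and uses no Drill API; the function name `discover_schema` and the sample records are made up for illustration. It only shows the general principle of reading records with no declared schema and inferring the fields and their types from the data itself.

```python
import json
from collections import defaultdict


def discover_schema(json_lines):
    """Map each field name to the sorted list of Python type names seen for it.

    Hypothetical helper for illustration only; assumes every line is a JSON object.
    """
    schema = defaultdict(set)
    for line in json_lines:
        record = json.loads(line)
        for field, value in record.items():
            schema[field].add(type(value).__name__)
    return {field: sorted(types) for field, types in schema.items()}


# Made-up sample records: fields appear and change type from one record to the next.
records = [
    '{"id": 1, "name": "spark"}',
    '{"id": 2, "name": "drill", "tags": ["sql"]}',
    '{"id": "3", "stars": 4.5}',
]

schema = discover_schema(records)
for field, types in schema.items():
    print(field, types)
```

Even in this toy version, the `id` field ends up with two observed types, which hints at why querying schema-free data is harder than it first looks.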
A few of the takeaway ideas:
- We need a standard for Hadoop: today, a user who wants to switch between Hive and Spark, for example, faces incompatibility and a lack of transparency.
- We still need tools to replace the unicorns (exceptional individuals who deeply understand distributed systems and know how to tune them).
- The Apache Software Foundation provides an exciting framework for managing and motivating groups of skilled developers to contribute to and participate in an open community.