Hadoop - Sequence and Map File

Hadoop Sequence Files:
-A sequence file is a flat file made up of binary key/value pairs.
-There are three different sequence file formats:
1.Uncompressed key/value records.
2.Record-compressed key/value records - here only the values are compressed.
3.Block-compressed key/value records - here keys and values are collected into blocks separately, and each block is compressed.
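
The difference between record and block compression can be sketched with plain Java. This is only an illustration using java.util.zip, not the Hadoop API or the real sequence file layout. The class name CompressionModes, its methods, and the sample values are all made up for this example, and only values are compressed here, whereas a real block-compressed file also batches the keys.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;

// Hypothetical helper: compares per-record and per-block compression sizes.
final class CompressionModes {

    // Deflate a byte array and return the compressed bytes.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int written = deflater.deflate(buffer);
            out.write(buffer, 0, written);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Record style: every value is compressed on its own.
    static int recordCompressedSize(List<byte[]> values) {
        int total = 0;
        for (byte[] value : values) {
            total += deflate(value).length;
        }
        return total;
    }

    // Block style: many values are buffered together and compressed once.
    static int blockCompressedSize(List<byte[]> values) {
        ByteArrayOutputStream block = new ByteArrayOutputStream();
        for (byte[] value : values) {
            block.write(value, 0, value.length);
        }
        return deflate(block.toByteArray()).length;
    }

    // Made-up sample data: short, similar values, as often seen in logs.
    static List<byte[]> sampleValues(int count) {
        List<byte[]> values = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String value = "status=OK;host=node-" + (i % 10) + ";";
            values.add(value.getBytes(StandardCharsets.UTF_8));
        }
        return values;
    }

    public static void main(String[] args) {
        List<byte[]> values = sampleValues(200);
        System.out.println("record compressed bytes: " + recordCompressedSize(values));
        System.out.println("block compressed bytes:  " + blockCompressedSize(values));
    }
}
```

For short, repetitive values like these, compressing a whole block at once usually gives a much smaller result, because the compressor can reuse patterns across records.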

Small File Problem:
Each small file can be stored as the value of one record (with, for example, the file name as the key), so that many small files end up in a single sequence file. This reduces the load on the NameNode, which would otherwise have to keep metadata for every individual file.
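
As a rough sketch of the idea, the following plain Java code packs several small files into one container of key/value records and reads them back. It is a simplified stand-in, not the Hadoop SequenceFile API or its on-disk format. The class SmallFilePacker, the record layout, and the choice of file name as key are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: many small files become records in one container.
final class SmallFilePacker {

    // Write each file as one record: key = file name, value = file content.
    static byte[] pack(Map<String, byte[]> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                out.writeUTF(file.getKey());           // key
                out.writeInt(file.getValue().length);  // value length
                out.write(file.getValue());            // value
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Read the records back in the order they were written.
    static Map<String, byte[]> unpack(byte[] container) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(container))) {
            while (in.available() > 0) {
                String name = in.readUTF();
                byte[] content = new byte[in.readInt()];
                in.readFully(content);
                files.put(name, content);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }
}
```

With this layout the storage layer has to track one container instead of one entry per small file, which is the effect the sequence file has on the NameNode.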

MapFile:
A map file is actually a directory. Inside it there is an "index" file and a "data" file.
The data file is a sequence file that holds the keys and their associated values, sorted by key.
The index file is smaller. It also holds key/value pairs: the key is an actual key from the data file (only a subset of the keys is indexed), and the value is the byte offset of that record in the data file.
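
How the index and data files work together on a lookup can be sketched in plain Java. This is a simplified in-memory model, not the Hadoop MapFile API. The class MiniMapFile, its record layout, and the indexInterval parameter are assumptions made for this example (Hadoop's own writer indexes every 128th key by default).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a map file: sorted data plus a sparse index.
final class MiniMapFile {
    private final byte[] data;  // stand-in for the "data" file
    // Stand-in for the "index" file: key -> byte offset in data.
    private final TreeMap<String, Integer> index = new TreeMap<>();

    // Entries arrive sorted by key; every indexInterval-th key is indexed.
    MiniMapFile(TreeMap<String, String> entries, int indexInterval) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        try {
            int position = 0;
            for (Map.Entry<String, String> entry : entries.entrySet()) {
                if (position % indexInterval == 0) {
                    index.put(entry.getKey(), out.size()); // offset of this record
                }
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
                position++;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        data = bytes.toByteArray();
    }

    // Find the closest indexed key at or before the target, then scan forward.
    String get(String key) {
        Map.Entry<String, Integer> start = index.floorEntry(key);
        if (start == null) {
            return null; // key sorts before the first record
        }
        int offset = start.getValue();
        DataInputStream in = new DataInputStream(
            new ByteArrayInputStream(data, offset, data.length - offset));
        try {
            while (in.available() > 0) {
                String currentKey = in.readUTF();
                String currentValue = in.readUTF();
                int comparison = currentKey.compareTo(key);
                if (comparison == 0) {
                    return currentValue;
                }
                if (comparison > 0) {
                    return null; // passed the place where the key would be
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null;
    }

    int indexSize() {
        return index.size();
    }
}
```

Because the data is sorted, the small index only has to narrow the search down to a short stretch of the data file, which is then read sequentially.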




