Big data performance challenges – scale-out, HDDs and Hadoop®

Organizations require vast computing resources to mine big data – massive amounts of information from various sources – for correlations, patterns and other findings that provide clear, actionable insights. These insights, in turn, enable better business decisions and help drive competitive advantage. With big data, processing speed is crucial. Long job processing times are the enemy of fast decision-making because they slow access to actionable data.

What’s more, the sheer processing requirements of big data can lead to higher power consumption and hardware costs as organizations scale IT infrastructure to keep pace with explosive data growth. Traditional enterprise architectures like storage area networks can be ill-suited to big data because of the high latency inherent in networked storage. As an alternative, many IT organizations are scaling out server clusters using direct-attached hard disk drive (HDD) storage. Slow HDDs, however, choke performance in some phases of big data processing.

For its part, Hadoop® software has rapidly emerged as the open-source software framework of choice for big data analytics – for deployments such as e-commerce, research portals, energy simulations and live trading strategy simulations. For organizations to unlock the full value of big data analytics, the response times and bandwidth of Hadoop and other big data environments must be optimized.

Accelerate big data and Hadoop performance with flash

Databases such as HBase primarily use fast server memory to process big data, but that data must first be retrieved from HDD storage, and the slow retrieval can degrade application performance. To accelerate Hadoop and similar deployments, some organizations are deploying flash storage in the form of server-based PCIe® flash. The low latency of flash helps organizations speed big data access times for faster decision-making and greater business agility.

LSI Nytro MegaRAID® solutions combine low-latency flash with HDD connectivity and data protection to help accelerate data analytics and reduce latency to move big data faster. LSI Nytro MegaRAID solutions feature an intelligent, integrated cache that automatically detects frequently accessed, or hot, data and moves it to high-performing flash, enabling organizations to:

  • Accelerate performance and turn data into actionable information faster than ever before
  • Reduce capital investment by spreading jobs over fewer machines
  • Decrease total cost of ownership by reducing power requirements and datacenter cooling costs
  • Scale TCO savings from small Hadoop clusters to large clusters based on business growth and data analytics potential
  • Optimize Hadoop environments by accelerating response times and increasing bandwidth for the intermediate data passed from map tasks to reduce tasks during the shuffle phase of MapReduce
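
As a rough illustration of the hot-data caching idea described above, the following Python sketch models a cache that counts accesses per block and promotes frequently read blocks to a small fast tier. It is a toy model under stated assumptions, not the algorithm used by LSI Nytro MegaRAID: the class name HotDataCache, the flash_capacity and hot_threshold parameters, and the simple access-count policy are all hypothetical and chosen only for clarity.

```python
from collections import Counter


class HotDataCache:
    """Toy frequency-based cache tier (hypothetical; not the vendor's algorithm)."""

    def __init__(self, flash_capacity, hot_threshold):
        self.flash_capacity = flash_capacity  # max number of blocks in the fast tier
        self.hot_threshold = hot_threshold    # accesses before a block counts as hot
        self.access_counts = Counter()        # block ID -> number of reads seen
        self.flash = set()                    # block IDs currently held in flash

    def read(self, block_id):
        """Record one access and return the tier that served it."""
        self.access_counts[block_id] += 1
        if block_id in self.flash:
            return "flash"
        # This read is still served from HDD; promotion helps later reads.
        if self.access_counts[block_id] >= self.hot_threshold:
            self._promote(block_id)
        return "hdd"

    def _promote(self, block_id):
        """Place a hot block in flash, evicting the least-read block if full."""
        if self.flash_capacity <= 0:
            return
        if len(self.flash) >= self.flash_capacity:
            coldest = min(self.flash, key=lambda b: self.access_counts[b])
            if self.access_counts[coldest] >= self.access_counts[block_id]:
                return  # resident blocks are at least as hot; keep them
            self.flash.remove(coldest)
        self.flash.add(block_id)


# Example: block "a" is read repeatedly and becomes hot after three accesses.
cache = HotDataCache(flash_capacity=2, hot_threshold=3)
trace = ["a", "b", "a", "a", "a", "c", "a"]
served = [cache.read(block) for block in trace]
print(served)
```

In this trace, the first three reads of block "a" come from HDD; once it crosses the threshold, later reads are served from the fast tier, while the rarely read blocks "b" and "c" stay on HDD. A production cache would also need to handle writes, data protection and aging of stale counts, which this sketch leaves out.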