Make or buy? Which is the right infrastructure for big data?

The "make or buy" question never goes out of fashion: should a company build in-house or look for a product or service outside? When it comes to the most suitable infrastructure for big data, a few considerations can help in choosing one solution over another.

The variables are many, starting with the one that should orient the company: which need must big data management meet in order to contribute to business growth?

Is it greater knowledge of customer behavior, drawn from various data sources (big data marketing)? A global view of processes across the production and logistics cycle (big data analytics for Industry 4.0)? A wider range of financial information (big data analytics to support the augmented CFO)? These are some of the potential applications that companies should clarify before identifying the right infrastructure to support the objectives they intend to achieve.

 

From the timing of data analysis to their multiplicity

The specific requirement determines when the results of data analysis must be available, and real time is not always necessary. Batch processing at scheduled intervals may be sufficient; so may on-demand processing, run only when it is needed. In other cases, especially with IoT devices, streaming is required to obtain information continuously and in real time.
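To make the difference between batch and streaming concrete, here is a minimal sketch in Python. It is only an illustration: the function names and the sensor readings are hypothetical, and a real system would rely on a dedicated processing framework rather than plain functions.

```python
from typing import Iterable, Iterator


def batch_average(readings: list[float]) -> float:
    """Batch: all the data is collected first, then processed in one pass."""
    return sum(readings) / len(readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Streaming: the result is updated every time a new reading arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


# Hypothetical temperature readings from an IoT sensor (made-up values).
sensor_readings = [20.0, 22.0, 24.0]

# Batch gives a single result once the whole set is available.
print(batch_average(sensor_readings))

# Streaming gives an updated result after each reading.
print(list(streaming_average(sensor_readings)))
```

The batch function answers once, after the fact; the streaming function always has a current answer, which is what continuous monitoring of IoT devices needs.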

Similarly, storage changes according to the type of data, its heterogeneity and its variety, moving from classic SQL relational databases to NoSQL and on to the more recent NewSQL.

This means that the infrastructure must be able to collect, store and process big data while meeting defined security standards that protect the data both at rest and in transit. This requires storage systems and servers, frameworks, databases, analytics software and other applications.

Finally, the infrastructure can be on-premise or take the form of a remote data center in the cloud. The "make or buy" decision plays out mainly between these two options.

 

On-premise infrastructure versus cloud infrastructure

To compare the costs of an on-premise solution and a cloud solution, the TCO (Total Cost of Ownership) should be set alongside the ROI (Return on Investment), bearing in mind that in the first case resources are tied up in an upfront investment, while in the second the spending is distributed over time. This comparison would very probably show local and cloud infrastructure to be substantially equal, unless the calculation also considers the workload placed on the IT team, which bears all the management, updating and maintenance costs in the "make" case, that is, local installation. To save money, you could turn to large-scale open-source data processing frameworks such as Hadoop and Spark, to mention the best known. Be careful, however, not to underestimate the complexity of managing these tools, which were created specifically for big data. The demands on IT staff could be so heavy as to absorb them completely, taking them away from everything else.
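The effect of the IT workload on the comparison can be illustrated with a small calculation. The sketch below is in Python and uses entirely made-up figures, chosen only so that the two options come out equal before staff costs are counted; the function names and cost categories are assumptions, not a standard TCO model.

```python
def on_premise_tco(hardware: int, yearly_maintenance: int,
                   yearly_it_staff: int, years: int) -> int:
    """Upfront investment plus recurring costs over the period."""
    return hardware + years * (yearly_maintenance + yearly_it_staff)


def cloud_tco(monthly_fee: int, years: int) -> int:
    """Spending distributed over time as a recurring fee."""
    return monthly_fee * 12 * years


YEARS = 3  # hypothetical evaluation period

# Hypothetical figures, for illustration only.
cloud = cloud_tco(monthly_fee=5_000, years=YEARS)

# On-premise, ignoring the load on the IT team.
local_without_staff = on_premise_tco(
    hardware=120_000, yearly_maintenance=20_000,
    yearly_it_staff=0, years=YEARS)

# On-premise, including the IT team's time.
local_with_staff = on_premise_tco(
    hardware=120_000, yearly_maintenance=20_000,
    yearly_it_staff=40_000, years=YEARS)

print(cloud, local_without_staff, local_with_staff)
```

With these invented numbers the two options cost the same until staff time is included, at which point the local installation becomes clearly more expensive. Real figures will differ from company to company.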

Make or buy? Better to rent, together with an experienced partner

The question of the right infrastructure for big data goes hand in hand with that of the specialized expertise available. Company IT departments do not necessarily have data scientists or experts in programming languages such as R and Python, which are part of the world of big data analysis. For this reason, the answer to the "make or buy" alternative is a third way: renting. Operational rental, which companies already use to avoid purchasing technologies subject to rapid obsolescence, can also cover the infrastructure for big data in three variants: Infrastructure as a Service (IaaS) for computing and storage, Platform as a Service (PaaS) for databases, and Software as a Service (SaaS) for ready-to-use applications. The important thing is that adoption is accompanied by consultancy from an experienced partner who knows the world of big data in depth and knows how to derive value from it based on the company's goals.