Planning Your Cluster: Cluster Checklist
You've explored the concept of clustering and perhaps even have some first-hand experience. Now it's time to get serious about implementing a cluster system that will help you and your team achieve the highest levels of productivity. This section will help you with your evaluation and planning by outlining the most relevant data points to address during your cluster system design process. Our experts can walk you through this process and help determine a system that is just right for your organization's requirements.
- Identify the right cluster type
- Identify the best interconnect for your purposes
- Evaluate your storage requirements
- Determine your operating system and middleware software infrastructure requirements
- Estimate the correct number of nodes and sizing
- Anticipate future expansion plans
- Clarify installation considerations
- Assess the level of resources available to your group for system deployment and administration
- Evaluate your options by weighing the importance of these ten common evaluation criteria
1. Identify the right cluster type
Clusters are ideal for a wide variety of applications. In identifying a system configuration and cluster type, it is important to first consider the usage goal of your system resource. Are you looking for a system that meets specific availability demands or will this be simply a computational resource? Next, you must consider the behavior of your application as it will influence how you implement your cluster. Some basic questions you should ask include: Is your application more computational or transactional in nature? Is your code written to accommodate MPI or other parallel libraries or does it run serially? Does your application require significant access to memory, disk, or other peripherals? Another important consideration is environmental or power usage requirements.
2. Identify the best interconnect for your purposes
An important question to ask is whether your application is more compute intensive or network intensive. If your application is embarrassingly parallel and its individual runs don't have heavy communication interdependencies, Gigabit Ethernet may be adequate to meet your needs.
However, if there is significant file sharing and interprocess communication within your code, you should consider a high-bandwidth, low-latency interconnect, such as InfiniBand. Penguin Computing is an expert in deploying such networks and has knowledgeable systems engineers on hand who can help you decide which interconnect is best for you.
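One rough way to judge whether your code is compute or network bound is to estimate how much of each work step is spent moving messages. The sketch below uses the common "latency plus size divided by bandwidth" model for a single message. The function names and the workload figures (message count, message size, latency) are hypothetical placeholders for illustration, not measurements of any particular interconnect; substitute numbers profiled from your own application.

```python
def message_time_s(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to deliver one message: fixed latency plus serialization time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def communication_fraction(compute_s, n_messages, size_bytes,
                           latency_s, bandwidth_bytes_per_s):
    """Share of one work step spent communicating (0.0 to 1.0).

    Assumes messages are sent one after another and do not overlap
    with computation, which is a pessimistic simplification.
    """
    comm_s = n_messages * message_time_s(
        size_bytes, latency_s, bandwidth_bytes_per_s)
    return comm_s / (comm_s + compute_s)


if __name__ == "__main__":
    # Hypothetical workload: 1 second of computation per step and
    # 1000 messages of 8 KiB each. The latency value is a placeholder.
    gigabit_bytes_per_s = 125_000_000  # 1 Gbit/s expressed in bytes/s
    share = communication_fraction(
        compute_s=1.0,
        n_messages=1000,
        size_bytes=8192,
        latency_s=50e-6,
        bandwidth_bytes_per_s=gigabit_bytes_per_s,
    )
    print(f"Estimated time spent communicating: {share:.1%}")
```

If the estimated share stays small on the slower link, the cheaper interconnect may be sufficient. If it dominates the step, a lower-latency, higher-bandwidth fabric is worth evaluating.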
3. Evaluate your storage requirements
Proper protection and storage of your data is a critical consideration for your overall cluster solution. Depending on your needs and budget, there are many different storage options that range from lower-cost JBOD (just a bunch of disks) configurations to more sophisticated Storage Area Network (SAN) and Network Attached Storage (NAS) solutions. In addition, storage can include many different levels of redundancy for data protection and integrity. It is important to consider the amount of storage needed, the manner and speed in which you want your system to interact with and access data, and the level of redundancy and protection you want to build into your storage design. Penguin offers a full range of storage solutions and can help you architect one that meets your needs.
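To see how redundancy affects the amount of storage you actually get, the following sketch compares usable capacity for a few common layouts. It is a simplified illustration: the function name is hypothetical, the RAID levels are standard examples rather than specific product configurations, and the figures ignore file system overhead and hot spares.

```python
def usable_capacity_tb(layout, disk_count, disk_tb):
    """Approximate usable capacity in TB for a set of identical disks."""
    if layout == "jbod":
        # No redundancy: every disk contributes its full capacity.
        return disk_count * disk_tb
    if layout == "raid5":
        # One disk's worth of capacity is consumed by parity.
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disk_count - 1) * disk_tb
    if layout == "raid6":
        # Two disks' worth of capacity is consumed by parity.
        if disk_count < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (disk_count - 2) * disk_tb
    if layout == "raid10":
        # Mirrored pairs: half the raw capacity is usable.
        if disk_count < 4 or disk_count % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return (disk_count // 2) * disk_tb
    raise ValueError(f"unknown layout: {layout}")


if __name__ == "__main__":
    # Hypothetical shelf of eight 4 TB disks.
    for layout in ("jbod", "raid5", "raid6", "raid10"):
        print(layout, usable_capacity_tb(layout, 8, 4.0), "TB usable")
```

Running the comparison with your own disk counts makes the cost of each protection level explicit before you commit to a design.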
4. Determine your operating system and middleware software infrastructure requirements
The type of system you deploy, as well as the application and data requirements, will influence the software layers that will be most helpful in deploying a successful cluster solution. Regardless of the type of system or the application you are running, we recommend Scyld ClusterWare as an important layer of your overall system solution. Scyld ClusterWare simplifies system deployment and administration, helping to maximize system usage. You should also consider job and batch schedulers, development tools, file systems, and other productivity tools that meet your specific application demands. Penguin experts can help you determine your system needs and implement a software stack that makes sense.
5. Estimate the correct number of nodes and sizing
If you would like assistance determining the scale of cluster you require, our experts will help you understand the trade-offs between performance and cost so you can achieve the best results from your cluster investment. We can also provide a real-world testing environment in which you have complete access to various system architectures to measure performance changes and determine how well your application scales.
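Before testing on real hardware, you can get a first feel for the performance and cost trade-off with Amdahl's law, which bounds the speedup of a program when only part of it runs in parallel. The sketch below is a back-of-the-envelope aid, not a substitute for benchmarking: the parallel fraction of 0.95 is a hypothetical value, and the model ignores communication overhead.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Ideal speedup on `nodes` nodes when `parallel_fraction` of the
    runtime can be parallelized (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


def parallel_efficiency(parallel_fraction, nodes):
    """Speedup per node: 1.0 means every added node is fully used."""
    return amdahl_speedup(parallel_fraction, nodes) / nodes


if __name__ == "__main__":
    # Hypothetical application that is 95% parallelizable.
    p = 0.95
    for n in (1, 8, 32, 128):
        print(f"{n:4d} nodes: speedup {amdahl_speedup(p, n):6.2f}, "
              f"efficiency {parallel_efficiency(p, n):.0%}")
```

When efficiency drops sharply as nodes are added, each additional node buys less performance for the same price, which is a useful signal for where to stop scaling.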
6. Anticipate future expansion plans
Identify other needs that may arise as your cluster grows, such as connections to other clusters or system infrastructure, expansion of your network infrastructure without substantial replacement of key elements, and storage to accommodate the additional data generated as more individuals take advantage of the cluster.
7. Clarify installation considerations
Identify where you plan to physically install your cluster. Find out what voltage (120V/208V/220V) and current (10/20/30 amperes) are available and whether the power is conditioned. Investigate how much cooling is available.
You can save time and headaches by working with Penguin Computing to design an application-ready, turnkey cluster that arrives pre-racked, wired, and fully configured. Our sales engineers can help you determine the optimal configuration for your specific needs.
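Once you know the voltage and current available at your site, a quick calculation shows roughly how many nodes each circuit can carry and how much heat they will produce. The sketch below makes several assumptions that you should verify with your facilities team: single-phase circuits, a derating factor of 0.8 for continuous load (a common planning practice, not a universal rule), and a hypothetical draw of 400 W per node.

```python
import math

WATTS_TO_BTU_PER_HOUR = 3.412  # 1 watt is about 3.412 BTU/h


def usable_watts(voltage, amps, derating=0.8):
    """Power budget of one single-phase circuit after derating."""
    return voltage * amps * derating


def nodes_per_circuit(voltage, amps, node_watts, derating=0.8):
    """Whole number of nodes that fit within one circuit's budget."""
    return math.floor(usable_watts(voltage, amps, derating) / node_watts)


def cooling_btu_per_hour(total_watts):
    """Heat load to remove, assuming all electrical power becomes heat."""
    return total_watts * WATTS_TO_BTU_PER_HOUR


if __name__ == "__main__":
    node_watts = 400  # hypothetical per-node draw; measure your own
    for voltage, amps in ((120, 20), (208, 30)):
        count = nodes_per_circuit(voltage, amps, node_watts)
        heat = cooling_btu_per_hour(count * node_watts)
        print(f"{voltage}V/{amps}A circuit: {count} nodes, "
              f"about {heat:.0f} BTU/h of cooling")
```

Comparing the result against your planned node count tells you early whether additional circuits or cooling capacity need to be arranged before delivery.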
8. Assess the level of resources available to your group for system deployment and administration
We recommend Scyld ClusterWare wherever possible because of the advantages it offers for system deployment and administration in terms of simplicity and ease of use. We recommend you consider the resources needed to implement or maintain the following functions:
- Initial system deployment and installation
- Disaster recovery
- Adding and deleting groups and users
- Adding and deleting additional cluster nodes
- Installing new applications and libraries
- Upgrading the kernel
- Mounting file systems
- Starting and restarting individual nodes as well as the entire system
- Use of out-of-band management software
- Compiling and running parallel applications
- Training appropriate to the complexity of the product and capabilities of technical staff
These are all problems solved by Scyld ClusterWare and Penguin Computing. Let us help simplify and eliminate your cluster computing concerns.
9. Evaluate your options by weighing the importance of these ten common evaluation criteria
- Total system price
- Administrative and maintenance costs
- Demonstrated ability to achieve target benchmarks
- Simplicity of packaging solution
- Expandability of network connections among clusters
- Quality of software environment, documentation and testing plan
- Ability to upgrade to a larger cluster
- Performance of storage systems
- Qualifications of installation team
- Support commitment and experience of vendor
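One simple way to weigh these criteria is a weighted scoring matrix: assign each criterion a weight that reflects its importance to your group, score each option against it, and compare the weighted averages. The sketch below illustrates the idea. The weights, the 1-to-5 scoring scale, and the sample scores are all hypothetical; only the criteria names come from the list above.

```python
CRITERIA = [
    "Total system price",
    "Administrative and maintenance costs",
    "Demonstrated ability to achieve target benchmarks",
    "Simplicity of packaging solution",
    "Expandability of network connections among clusters",
    "Quality of software environment, documentation and testing plan",
    "Ability to upgrade to a larger cluster",
    "Performance of storage systems",
    "Qualifications of installation team",
    "Support commitment and experience of vendor",
]


def weighted_score(weights, scores):
    """Weighted average of `scores`, using `weights` keyed by criterion.

    Raises KeyError if a weighted criterion has no score.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[c] * scores[c] for c in weights) / total_weight


if __name__ == "__main__":
    # Hypothetical priorities: price and support count double.
    weights = {c: 1.0 for c in CRITERIA}
    weights["Total system price"] = 2.0
    weights["Support commitment and experience of vendor"] = 2.0

    # Hypothetical scores for one option on a 1 (poor) to 5 (excellent) scale.
    option_scores = {c: 3 for c in CRITERIA}
    option_scores["Total system price"] = 4
    option_scores["Performance of storage systems"] = 5

    print(f"Weighted score: {weighted_score(weights, option_scores):.2f}")
```

Scoring every candidate configuration with the same weights keeps the comparison consistent and makes it easier to explain the final choice to stakeholders.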
