Friday, December 24, 2010
Saturday, December 11, 2010
POSCO: "Growing into a comprehensive materials company"
Source: http://www.greendaily.co.kr/news/articleView.html?idxno=10909
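POSCO held the "POSCO Global EVI Forum 2010" on the 17th at its Global R&D Center in Songdo, Incheon, inviting some 900 people from about 430 global customer companies. POSCO announced that it will go beyond the EVI (Early Vendor Involvement) strategy, under which it cooperates with parts makers from the earliest stage of all technology and product development, and pursue its own marketing strategy: a POSCO-style EVI (Expanded Value Initiative for customer) that proactively proposes product and technology development to customers across all industries and supplies total solutions.
Accordingly, POSCO will divide its demand industries into (1) major industries with a high share of steel demand, (2) new industries with large growth potential, and (3) project industries facing threats from substitute and low-cost materials, and will carry out comprehensive EVI activities focused on developing competitive technologies for active market development. Until now, leading global steelmakers such as Nippon Steel and ArcelorMittal have pursued EVI activities centered on automakers. POSCO is the first to conduct EVI activities for customers across all businesses, including home appliances, shipbuilding, energy, construction, and heavy equipment.
In the automotive sector, a major industry with a high share of steel demand, POSCO will pursue lighter car bodies and parts. In new industries with large growth potential, such as renewable energy, building materials, and offshore plants, it plans to focus on new-concept wind towers, lighter construction heavy equipment, and high-strength products that can replace existing materials.
Chairman Chung Joon-yang said in his opening address, "The way to survive in an uncertain competitive environment is for all management entities in the supply chain to run together for shared growth," and stressed that "by putting our soul into products and services and serving our customers, doing business with POSCO should itself bring customers happiness and serve as a stepping stone to their success."
Global top companies representing their industries, such as Toyota, Sony, ExxonMobil, and Caterpillar, attended the event. POSCO exchanged about 30 memoranda of understanding (MOUs) with major domestic and overseas customers on long-term materials supply and joint technology development.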
Gartner Symposium/ITxpo Webinar Series - Orlando
http://mediazone.brighttalk.com/event/Gartner/27d8d40b22-4312-event
Friday, December 10, 2010
The lights never go out at the "Materials Business Office" on the 8th floor of POSCO Center
Source: http://www.asiae.co.kr/news/view.htm?idxno=2009100709164445942
Gartner Identifies the Top 10 Strategic Technologies for 2011
Source: http://www.gartner.com/it/page.jsp?id=1454221
Analysts Examine Latest Industry Trends During Gartner Symposium/ITxpo, October 17-21, in Orlando
STAMFORD, Conn., October 19, 2010 —
Gartner, Inc. today highlighted the top 10 technologies and trends that will be strategic for most organizations in 2011. The analysts presented their findings during Gartner Symposium/ITxpo, being held here through October 21.
Gartner defines a strategic technology as one with the potential for significant impact on the enterprise in the next three years. Factors that denote significant impact include a high potential for disruption to IT or the business, the need for a major dollar investment, or the risk of being late to adopt.
A strategic technology may be an existing technology that has matured and/or become suitable for a wider range of uses. It may also be an emerging technology that offers an opportunity for strategic business advantage for early adopters or with potential for significant market disruption in the next five years. As such, these technologies impact the organization's long-term plans, programs and initiatives.
“Companies should factor these top 10 technologies in their strategic planning process by asking key questions and making deliberate decisions about them during the next two years,” said David Cearley, vice president and distinguished analyst at Gartner.
“Sometimes the decision will be to do nothing with a particular technology,” said Carl Claunch, vice president and distinguished analyst at Gartner. “In other cases, it will be to continue investing in the technology at the current rate. In still other cases, the decision may be to test or more aggressively deploy the technology.”
The top 10 strategic technologies for 2011 include:
Cloud Computing. Cloud computing services exist along a spectrum from open public to closed private. The next three years will see the delivery of a range of cloud service approaches that fall between these two extremes. Vendors will offer packaged private cloud implementations that deliver the vendor's public cloud service technologies (software and/or hardware) and methodologies (i.e., best practices to build and run the service) in a form that can be implemented inside the consumer's enterprise. Many will also offer management services to remotely manage the cloud service implementation. Gartner expects large enterprises to have a dynamic sourcing team in place by 2012 that is responsible for ongoing cloudsourcing decisions and management.
Mobile Applications and Media Tablets. Gartner estimates that by the end of 2010, 1.2 billion people will carry handsets capable of rich, mobile commerce providing an ideal environment for the convergence of mobility and the Web. Mobile devices are becoming computers in their own right, with an astounding amount of processing ability and bandwidth. There are already hundreds of thousands of applications for platforms like the Apple iPhone, in spite of the limited market (only for the one platform) and need for unique coding.
The quality of the experience of applications on these devices, which can apply location, motion and other context in their behavior, is leading customers to interact with companies preferentially through mobile devices. This has led to a race to push out applications as a competitive tool to improve relationships and gain advantage over competitors whose interfaces are purely browser-based.
Social Communications and Collaboration. Social media can be divided into: (1) Social networking: social profile management products, such as MySpace, Facebook, LinkedIn and Friendster, as well as social networking analysis (SNA) technologies that employ algorithms to understand and utilize human relationships for the discovery of people and expertise. (2) Social collaboration: technologies such as wikis, blogs, instant messaging, collaborative office, and crowdsourcing. (3) Social publishing: technologies that assist communities in pooling individual content into a usable and community-accessible content repository, such as YouTube and flickr. (4) Social feedback: gaining feedback and opinion from the community on specific items, as witnessed on YouTube, flickr, Digg, Del.icio.us, and Amazon. Gartner predicts that by 2016, social technologies will be integrated with most business applications. Companies should bring together their social CRM, internal communications and collaboration, and public social site initiatives into a coordinated strategy.
Video. Video is not a new media form, but its use as a standard media type used in non-media companies is expanding rapidly. Technology trends in digital photography, consumer electronics, the web, social software, unified communications, digital and Internet-based television and mobile computing are all reaching critical tipping points that bring video into the mainstream. Over the next three years Gartner believes that video will become a commonplace content type and interaction model for most users, and by 2013, more than 25 percent of the content that workers see in a day will be dominated by pictures, video or audio.
Next Generation Analytics. Increasing compute capabilities of computers including mobile devices along with improving connectivity are enabling a shift in how businesses support operational decisions. It is becoming possible to run simulations or models to predict the future outcome, rather than to simply provide backward looking data about past interactions, and to do these predictions in real-time to support each individual business action. While this may require significant changes to existing operational and business intelligence infrastructure, the potential exists to unlock significant improvements in business results and other success rates.
Social Analytics. Social analytics describes the process of measuring, analyzing and interpreting the results of interactions and associations among people, topics and ideas. These interactions may occur on social software applications used in the workplace, in internally or externally facing communities or on the social web. Social analytics is an umbrella term that includes a number of specialized analysis techniques such as social filtering, social-network analysis, sentiment analysis and social-media analytics. Social network analysis tools are useful for examining social structure and interdependencies as well as the work patterns of individuals, groups or organizations. Social network analysis involves collecting data from multiple sources, identifying relationships, and evaluating the impact, quality or effectiveness of a relationship.
Context-Aware Computing. Context-aware computing centers on the concept of using information about an end user or object’s environment, activities, connections and preferences to improve the quality of interaction with that end user. The end user may be a customer, business partner or employee. A contextually aware system anticipates the user's needs and proactively serves up the most appropriate and customized content, product or service. Gartner predicts that by 2013, more than half of Fortune 500 companies will have context-aware computing initiatives and by 2016, one-third of worldwide mobile consumer marketing will be context-awareness-based.
Storage Class Memory. Gartner sees huge use of flash memory in consumer devices, entertainment equipment and other embedded IT systems. It also offers a new layer of the storage hierarchy in servers and client computers that has key advantages — space, heat, performance and ruggedness among them. Unlike RAM, the main memory in servers and PCs, flash memory is persistent even when power is removed. In that way, it looks more like disk drives where information is placed and must survive power-downs and reboots. Given the cost premium, simply building solid state disk drives from flash will tie up that valuable space on all the data in a file or entire volume, while a new explicitly addressed layer, not part of the file system, permits targeted placement of only the high-leverage items of information that need to experience the mix of performance and persistence available with flash memory.
Ubiquitous Computing. The work of Mark Weiser and other researchers at Xerox's PARC paints a picture of the coming third wave of computing where computers are invisibly embedded into the world. As computers proliferate and as everyday objects are given the ability to communicate with RFID tags and their successors, networks will approach and surpass the scale that can be managed in traditional centralized ways. This leads to the important trend of imbuing computing systems into operational technology, whether done as calming technology or explicitly managed and integrated with IT. In addition, it gives us important guidance on what to expect with proliferating personal devices, the effect of consumerization on IT decisions, and the necessary capabilities that will be driven by the pressure of rapid inflation in the number of computers for each person.
Fabric-Based Infrastructure and Computers. A fabric-based computer is a modular form of computing where a system can be aggregated from separate building-block modules connected over a fabric or switched backplane. In its basic form, a fabric-based computer comprises a separate processor, memory, I/O, and offload modules (GPU, NPU, etc.) that are connected to a switched interconnect and, importantly, the software required to configure and manage the resulting system(s). The fabric-based infrastructure (FBI) model abstracts physical resources — processor cores, network bandwidth and links and storage — into pools of resources that are managed by the Fabric Resource Pool Manager (FRPM), software functionality. The FRPM in turn is driven by the Real Time Infrastructure (RTI) Service Governor software component. An FBI can be supplied by a single vendor or by a group of vendors working closely together, or by an integrator — internal or external.
A video replay of the Top 10 Strategic Technologies presentation will be available via the Gartner Symposium/ITxpo Webinar Series. The webinar series will provide full video replays of the Gartner Symposium/ITxpo keynotes, as well as selected Gartner analyst presentations. More information is available at http://mediazone.brighttalk.com/event/Gartner/27d8d40b22-4312-intro.
About Gartner Symposium/ITxpo
Celebrating its 20th anniversary, Gartner Symposium/ITxpo is the world's most important gathering of CIOs and senior IT executives. This event delivers independent and objective content with the authority and weight of the world's leading IT research and advisory organization, and provides access to the latest solutions from key technology providers. Gartner's annual Symposium/ITxpo events are key components of attendees' annual planning efforts. IT executives rely on Gartner Symposium/ITxpo to gain insight into how their organizations can use IT to address business challenges and improve operational efficiency. Additional information is available at www.gartner.com/symposium/us.
More exclusive content, expanding multi-media coverage, including Twitter feeds and comments from the Gartner Blog Network will be available at Gartner’s SymLive site at http://gartner.com/symlive.
Upcoming dates and locations for Gartner Symposium/ITxpo include:
October 25-27, Tokyo, Japan: www.gartner.com/jp/symposium
November 8-11, Cannes, France: www.gartner.com/eu/symposium
November 16-18, Sydney, Australia: www.gartner.com/au/symposium
Follow Gartner
Follow news, photos and video coming from Gartner Symposium/ITxpo on Facebook at http://www.facebook.com/home.php#/Gartner?ref=ts, on Twitter at http://twitter.com/Gartner_inc and using #GartnerSym, and on flickr at http://www.flickr.com/photos/27772229@N07/.
Top Ten ERP Software Predictions for 2011
Source: http://it.toolbox.com/blogs/erp-roi/top-ten-erp-software-predictions-for-2011-42364
Top Ten ERP Predictions for 2011
- Risk management and mitigation. Even though the economy may not be quite as bad as it was at this time last year, companies are still extremely risk averse. They are not willing to spend millions of dollars on ERP software that is difficult to implement or doesn't deliver measurable value. When they do go to implement, executives are going to rely on outside consultants and experts to help them manage and minimize risk.
- Increasing focus on organizational change management. Risk management is the name of the game for CIOs, and executives are finally realizing that organizational change management is arguably the single best way to mitigate and manage implementation risk. As recently as 2-3 years ago, before the current recession began, companies viewed org change as an optional and nice-to-have implementation activity. Now they are realizing that it is critical.
- Increasing need for ERP business cases, ROI analysis, and benefits realization. In the latter half of 2010, we saw a marked shift to organizations focusing on clearly defining a business case and conducting an ROI analysis to assess the viability of their ERP initiatives. Given the risk aversion of many companies, this trend is likely to continue into 2011. This quantitative focus has been a core part of Panorama's methodology since its inception, so this is a welcome trend that will ultimately benefit companies.
- ERP lawsuits and canceled ERP projects. Despite companies' desires to mitigate risk and focus on organizational change, they are still going to be pressured by slim IT budgets in the new year. This is going to create a conflicting pressure to cut costs in the wrong places, which will ultimately increase the rate of ERP failures. In addition, because of the low tolerance for risk, companies will be faster to pull the plug on troubled projects and file ERP lawsuits against their vendors if needed.
- ERP vendors will get their "mojo" back. Up until recent months, most ERP vendors were getting hammered by a mix of increased competition, tight IT budgets, and mediocre financial results. Signs in the latter half of 2010 pointed to increasing IT spending and pent-up demand for enterprise systems, which is likely to continue into 2011. This should give software vendors increased confidence to hold the line on software pricing, invest more in R&D, and provide more product enhancements.
- ERP vendor consolidation. Even though ERP vendors as a whole will be stronger in the coming year compared to years past, not all of them will be so lucky. As we emerge from the recession, the market will diverge into a class of stronger ERP vendors and a class of weaker players. Look for the stronger players to acquire some of the weaker ones, resulting in a wave of consolidation.
- Heavy adoption of Software as a Service (SaaS) models at small and mid-size businesses (SMBs). Assuming SMBs and start-ups lead us out of the economic doldrums as they have in past recessions, they will look to enterprise software to provide their business foundations for growth. However, these bootstrapped start-ups aren't likely to have the capital funds for heavy up-front costs, so they will likely look more to SaaS ERP and CRM systems.
- Continued buzz around cloud computing. While SaaS ERP systems are still years away from capturing a significant portion of the ERP market among mid-size to large organizations, CIOs will continue to look at other cloud computing options. For example, hosted ERP solutions and outsourced IT infrastructures will likely be on the minds of many CIOs. In addition, although larger companies may not yet be in a position to adopt enterprise-wide SaaS models, they will continue to evaluate targeted SaaS solutions, such as Document Management Systems (DMS), Human Resource Systems (HRM/HCM), Product Lifecycle Management (PLM), and Customer Relationship Management (CRM).
- A good year for CRM software. Most companies have cut their operating and labor costs to the bone throughout the recession. Most are also starting to realize that the only way to make it out of the recession stronger is to fuel top-line growth and sales, and most will do so without hiring too many new sales and customer service reps. For this reason, companies will look to CRM software and social CRM applications to help make their existing sales and customer service functions more effective and efficient.
- More focus on diagnostics, analytics, and business intelligence. Companies have reduced their margins of error for missteps during the recession, so companies will continue to rely on their ERP systems to help provide operational data to help make better and more informed decisions. Look for diagnostics, analytics, and business intelligence applications to gain momentum in the coming year.
What does this all mean to our clients and other companies considering ERP investments in the coming year? The companies that choose the right ERP software for their organizations, best manage business and organizational risk, implement effectively, and position themselves for benefits realization will be better positioned as they head into the new year. This will require companies to more effectively assess vendor viability during their ERP selection processes and leverage ERP implementation best practices more than they have in the past.
Panorama continues to provide tools, expertise, and resources to those wanting to effectively navigate their ERP system challenges in the new year. Visit our resource center to download industry reports, white papers, benchmark metrics, and other useful information related to ERP selection and implementation best practices.
Happy holidays and here's to a prosperous and successful 2011!
Top 10 ERP Software Predictions for 2010
Source: http://it.toolbox.com/blogs/erp-roi/top-10-erp-software-predictions-for-2010-36035
A new decade is upon us and the ERP software industry looks quite different than it did at the start of the decade. Ten years ago, the enterprise software space was booming, IT budgets were flush, and companies were replacing systems left and right in preparation for Y2K.
In contrast, the decade closes with depressed IT spending levels, revenue contraction among many ERP vendors, and uncertainty about the future. However, there are several things to be optimistic about. Here are our ten predictions for the enterprise software space in 2010:
1. Diligent focus on ERP benefits realization and ROI. Long gone are the days of spending like it's 1999 and hoping for the best. CIOs and COOs will continue to face pressure to prove that every dime of investment in ERP systems is justified and generates a solid return on investment. Look for more deliberate spending, more phased rollouts, buying licenses only as they're needed, and hesitancy to invest in more expensive advanced enterprise software modules.
2. SMBs to get back into the ERP software market. The bright spot in any recovering economy is usually small business (SMBs). As the economy emerges from the recession, SMBs will look for small business software to automate their operations and scale for growth. In addition, large software vendors such as SAP and Oracle will continue to focus on the SMB market to reinvigorate their revenue growth in software license sales.
3. Increased adoption of Software as a Service (SaaS) at SMBs. While SMBs may lead the charge in their small business software investments, it may be difficult for them to make the necessary investments. Given that tight credit markets will likely continue into the new decade, many SMBs will look to SaaS enterprise software to help them minimize up front capital IT costs.
4. Lots of SaaS talk, but not as much action at large organizations. Larger companies, on the other hand, are likely to consider SaaS options, but are much less likely than their SMB counterparts to commit to these deployment models. As software vendors expand hybrid solutions combining the benefits of SaaS with the flexibility of traditional ERP (e.g. Oracle's On Demand and SAP's Business By Design offerings), larger organizations will continue opting for non-SaaS options that more commonly reduce cost and risk while maximizing business benefits in the long-term. They will, however, be more inclined to leverage SaaS for some niche functions, such as Document Management Systems (DMS), Human Resource Systems (HRM/HCM), Product Lifecycle Management (PLM), and Customer Relationship Management (CRM).
5. Increasing focus on organizational change management and benefits realization. As demonstrated by the exponential growth in Panorama's organizational change management practice, companies are directing much of their ERP software investments to areas that ensure they implement effectively and get more out of their existing enterprise investments. The need to more effectively manage organizational and business risk will likely result in a continuation of this trend in 2010.
6. It's still a buyers' market. Even in the most optimistic scenario, overall 2010 enterprise software spending will not return to pre-recession levels. This means ERP software buyers will remain in the driver's seat, which will be reflected in aggressive software pricing and shared benefits implementation models, such as that introduced by Epicor late this year.
7. Enterprise software risk management. As CIOs and executive teams remain on the hot seat to prove the value of their investments, risk management will be the name of the game. Look for more ERP implementations to leverage organizational change management and independent oversight of software vendors to help mitigate business risk.
8. Software vendor consolidation. Vendor competition was fierce before the recession and is even more so now. Dozens of smaller vendors are starved for cash and unable to fuel R&D and other product innovations without infusions of capital. Add the fact that larger vendors have cash and some have grown successfully via acquisition to date (e.g. Oracle and Infor), and continued vendor consolidation looks inevitable.
9. Focus on integration rather than major product enhancements. Given corporate aversion to risk, companies are going to be less likely to bet on entirely new products or risky upgrades. As a result, vendors are more likely to invest in incremental product enhancements and tighter integration between modules rather than revolutionary changes to their software.
10. Niches, low-hanging fruit, and business value. Look for companies to be very deliberate about how they invest in enterprise software, the risk they're willing to take, and how they manage implementations. If executives aren't convinced that their enterprise software investments will deliver measurable business value, they won't invest in it. Areas that deliver immediate value are priorities for the coming year.
We are optimistic about the coming year and can't help but wonder if the economic recession is exactly what the enterprise software market needed. ERP failures, cost overruns, difficult software vendors, and lack of business benefits had become too frequent, but these lean times will not allow for these trends to continue.
So what does this mean to clients and other companies considering ERP investments in the coming year? The companies that choose the right software for their organizations, best manage business and organizational risk, implement effectively, and position themselves for benefits realization will be better positioned headed into the recovery. This will require companies to more effectively assess vendor viability during their ERP selection processes and leverage ERP implementation best practices more than they have in the past.
Top Trends in ERP for 2010
http://blogs.dlt.com/top-10-trends-erp-2010-part/
http://blogs.dlt.com/erp-forecast-top-trends-erp-2010-part-ii/
http://blogs.dlt.com/top-trends-erp-2010-part-iii/
http://blogs.dlt.com/top-trends-in-erp-2010-part-iv/
- Upgrade and footprint expansion activity.
- Open Source.
- Small businesses going ERP sooner.
- Mobile ERP.
- New enterprise resource functionality: energy utilization
- 3rd party support vs. maintenance contract renewals
- New-growth markets
- Expanding ERP
- SaaS is probably the most significant non-ERP trend to look forward to
- Micro-Verticalization will be delivered by channel partners.
Saturday, December 4, 2010
Wednesday, December 1, 2010
hadoop on cygwin
Hadoop is a distributed computing platform.
Hadoop primarily consists of the Hadoop Distributed FileSystem (HDFS) and an implementation of the Map-Reduce programming paradigm.
Hadoop is a software framework that lets one easily write and run applications that process vast amounts of data. Here's what makes Hadoop especially useful:
- Scalable: Hadoop can reliably store and process petabytes.
- Economical: It distributes the data and processing across clusters of commonly available computers. These clusters can number into the thousands of nodes.
- Efficient: By distributing the data, Hadoop can process it in parallel on the nodes where the data is located. This makes it extremely rapid.
- Reliable: Hadoop automatically maintains multiple copies of data and automatically redeploys computing tasks based on failures.
Requirements
Platforms
- Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
- Win32 is supported as a development platform. Distributed operation has not been well tested on Win32, so this is not a production platform.
Requisite Software
- Java 1.6.x, preferably from Sun. Set JAVA_HOME to the root of your Java installation.
- ssh must be installed and sshd must be running to use Hadoop's scripts to manage remote Hadoop daemons.
- rsync may be installed to use Hadoop's scripts to manage remote Hadoop installations.
Additional requirements for Windows
- Cygwin - Required for shell support in addition to the required software above.
Installing Required Software
If your platform does not have the required software listed above, you will have to install it. For example, on Ubuntu Linux:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
On Windows, if you did not install the required software when you installed cygwin, start the cygwin installer and select the packages:
- openssh - the "Net" category
- rsync - the "Net" category
Getting Started
First, you need to get a copy of the Hadoop code.
Edit the file conf/hadoop-env.sh to define at least JAVA_HOME.
Try the following command:
bin/hadoop
This will display the documentation for the Hadoop command script.
Standalone operation
By default, Hadoop is configured to run things in a non-distributed mode, as a single Java process. This is useful for debugging, and can be demonstrated as follows:
mkdir input
cp conf/*.xml input
bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
cat output/*
This will display counts for each match of the regular expression.
Note that input is specified as a directory containing input files and that output is also specified as a directory where parts are written.
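As a rough local cross-check of what the grep example computes, the same counts can be produced with ordinary shell tools. This is only a sketch: count_matches is a hypothetical helper, not part of Hadoop, it assumes a grep that supports the -o and -h options (GNU grep, as shipped with Cygwin and most Linux systems, does), and its output layout is not identical to Hadoop's part files.

```shell
#!/bin/sh
# Count each distinct match of a regular expression across the files in a
# directory, most frequent first.
# count_matches is a hypothetical helper for illustration, not a Hadoop command.
count_matches() {
    dir=$1
    pattern=$2
    # -o prints only the matched text, -h suppresses file names,
    # -E enables extended regular expressions (needed for '+').
    grep -ohE "$pattern" "$dir"/* | sort | uniq -c | sort -rn
}
```

Running count_matches input 'dfs[a-z.]+' against the input directory created above should list the same matches that the Hadoop job writes to its output directory.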
Distributed operation
To configure Hadoop for distributed operation you must specify the following:
- The NameNode (Distributed Filesystem master) host. This is specified with the configuration property fs.default.name.
- The JobTracker (MapReduce master) host and port. This is specified with the configuration property mapred.job.tracker.
- A slaves file that lists the names of all the hosts in the cluster. The default slaves file is conf/slaves.
Pseudo-distributed configuration
You can in fact run everything on a single host. To run things this way, put the following in conf/core-site.xml:
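The configuration values themselves are not shown here. As a minimal sketch, the script below generates a single-host configuration, assuming the conventional values hdfs://localhost:9000 for fs.default.name, localhost:9001 for mapred.job.tracker, and a dfs.replication of 1 in conf/hdfs-site.xml. Those ports and the replication factor are assumptions rather than values given in this text, and write_site_file and write_pseudo_distributed_conf are hypothetical helper names.

```shell
#!/bin/sh
# Write a single-host (pseudo-distributed) configuration.
# ASSUMPTIONS: localhost:9000, localhost:9001 and a replication factor of 1
# are conventional single-node settings, not values stated in the text.
# Both functions are hypothetical helpers, not part of Hadoop.
write_site_file() {
    # $1 = target file, $2 = property name, $3 = property value
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

write_pseudo_distributed_conf() {
    # $1 = configuration directory (defaults to conf)
    conf_dir=${1:-conf}
    mkdir -p "$conf_dir"
    write_site_file "$conf_dir/core-site.xml"   fs.default.name    hdfs://localhost:9000
    write_site_file "$conf_dir/hdfs-site.xml"   dfs.replication    1
    write_site_file "$conf_dir/mapred-site.xml" mapred.job.tracker localhost:9001
}
```

Calling write_pseudo_distributed_conf from the Hadoop directory overwrites the three files under conf/, so back up any existing settings first.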
Now check that the command
ssh localhost
does not require a password. If it does, execute the following commands:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Bootstrapping
A new distributed filesystem must be formatted with the following command, run on the master node:
bin/hadoop namenode -format
The Hadoop daemons are started with the following command:
bin/start-all.sh
Daemon log output is written to the logs/ directory.
Input files are copied into the distributed filesystem as follows:
bin/hadoop fs -put input input
Distributed execution
Things are run as before, but output must be copied locally to examine it:
bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
bin/hadoop fs -get output output
cat output/*
When you're done, stop the daemons with:
bin/stop-all.sh
Fully-distributed operation
Fully distributed operation is just like the pseudo-distributed operation described above, except that you specify:
- The hostname or IP address of your master server in the value for fs.default.name, as hdfs://master.example.com/ in conf/core-site.xml.
- The host and port of your master server in the value of mapred.job.tracker as master.example.com:port in conf/mapred-site.xml.
- Directories for dfs.name.dir and dfs.data.dir in conf/hdfs-site.xml. These are local directories used to hold distributed filesystem data on the master node and slave nodes respectively. Note that dfs.data.dir may contain a space- or comma-separated list of directory names, so that data may be stored on multiple local devices.
- mapred.local.dir in conf/mapred-site.xml, the local directory where temporary MapReduce data is stored. It also may be a list of directories.
- mapred.map.tasks and mapred.reduce.tasks in conf/mapred-site.xml. As a rule of thumb, use 10x the number of slave processors for mapred.map.tasks, and 2x the number of slave processors for mapred.reduce.tasks.
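The rule of thumb in the last item is simple arithmetic. The sketch below assumes that "number of slave processors" means slave nodes multiplied by processor cores per node; that reading, and the function name suggest_task_counts, are assumptions.

```shell
#!/bin/sh
# Sketch: apply the 10x / 2x rule of thumb for task counts.
# Assumption: "slave processors" = slave nodes x cores per node.
# suggest_task_counts is a hypothetical helper name.

suggest_task_counts() {
    # $1 = number of slave nodes, $2 = processor cores per slave
    processors=$(( $1 * $2 ))
    echo "mapred.map.tasks=$(( processors * 10 ))"
    echo "mapred.reduce.tasks=$(( processors * 2 ))"
}

# Example: 4 slaves with 2 cores each gives 8 processors,
# so this prints mapred.map.tasks=80 and mapred.reduce.tasks=16.
suggest_task_counts 4 2
```

Treat the result as a starting point and adjust it for your own jobs.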
Hadoop 0.20.S Virtual Machine Appliance
http://developer.yahoo.com/blogs/hadoop/posts/2010/06/hadoop_020s_virtualmachine/
Tue June 29, 2010 (Updated)
by Ajay Anand
At Yahoo!, we recently implemented a stronger notion of security for the Hadoop platform, based on Kerberos as the underlying authentication system. We also successfully enabled this feature within Yahoo! on our internal data processing clusters. I am sure many Hadoop developers and enterprise users are looking forward to getting hands-on experience with this enterprise-class Hadoop Security feature.
In the past, we've helped developers and users get started with Hadoop by hosting a comprehensive Hadoop tutorial on YDN, along with a pre-configured single-node Hadoop (0.18.0) Virtual Machine appliance.
This time, we decided to upgrade this Hadoop VM with a pre-configured single-node Hadoop 0.20.S cluster, along with the required Kerberos system components. We have also included Pig (version 0.7.0), a high-level SQL-like data processing language used at Yahoo!.
This blog post describes how to get started with the Hadoop 0.20.S VM appliance. The basic information about downloading, setting up VM Player, and using the Hadoop VM is the same as described in tutorial module 3, except that you need to use the information and links below to download the latest VM Player and the Hadoop 0.20.S VM image. You should also review the security-specific commands below, which must be run before running M/R or Pig jobs.
For more details on deploying and configuring the Yahoo! Hadoop 0.20.S security distribution, look for continuing announcements and details on Hadoop-YDN.
Installing and Running the Hadoop 0.20.S Virtual Machine:
- Virtual Machine and Hadoop environment: See details here.
- Install VMware Player: See details here. To download the latest VMware Player for Windows/Linux, go to the VMware site.
- Setting up the Virtual Environment for Hadoop 0.20.S:
Copy the [Hadoop 0.20.S Virtual Machine] into a location on your hard drive. It is a zipped VMware folder (hadoop-vm-appliance-0-20-S, approximately 400MB) that includes a few files: a .vmdk file that is a snapshot of the virtual machine's hard drive, and a .vmx file that contains the configuration information to start the virtual machine. After unzipping the folder, double-click the hadoop-appliance-0.20.S.vmx file to start the virtual machine. Note: The uncompressed size of the hadoop-vm-appliance-0-20-S folder is ~2GB. Also, depending on the data you upload for testing, the VM disk is configured to grow up to 20GB.
When you start the virtual machine for the first time, VMware Player will recognize that the virtual machine image is not in the same location it used to be. You should inform VMware Player that you copied this virtual machine image (choose "I copied it"). VMware Player will then generate new session identifiers for this instance of the virtual machine. If you later move the VM image to a different location on your own hard drive, you should tell VMware Player that you have moved the image.
After you select this option and click OK, the virtual machine should begin booting normally. You will see it perform the standard boot procedure for a Linux system. It will bind itself to an IP address on an unused network segment, and then display a prompt allowing a user to log in.
Note: The IP address displayed on the login screen can be used to connect to the VM instance over SSH. The login screen also displays information about starting and stopping the Hadoop daemons, users and passwords, and how to shut down the VM. Note: It is much more convenient to access the VM via SSH. See details here.
- Virtual Machine User Accounts:
The virtual machine comes pre-configured with two user accounts: "root" and "hadoop-user". The hadoop-user account has sudo permissions to perform system-management functions, such as shutting down the virtual machine. The vast majority of your interaction with the virtual machine will be as hadoop-user. To log in as hadoop-user, first click inside the virtual machine's display. The virtual machine will take control of your keyboard and mouse. To escape back into Windows at any time, press CTRL+ALT at the same time. The hadoop-user user's password is hadoop. To log in as root, the password is root.
- Hadoop Environment:
Linux : Ubuntu 8.04
Java : JRE 6 Update 7 (See License info @ /usr/jre16/)
Hadoop : 0.20.S (installed @ /usr/local/hadoop; /home/hadoop-user/hadoop is a symlink to the install directory)
Pig : 0.7.0 (pig jar is installed @ /usr/local/pig; /home/hadoop-user/pig-tutorial/pig.jar is a symlink to the one in the install directory)
Login: hadoop-user, Passwd: hadoop (sudo privileges are granted for hadoop-user). The other users are hdfs and mapred (passwd: hadoop).
The Hadoop VM starts all the required Hadoop and Kerberos daemons during the boot-up process, but in case you need to stop or restart them:
- To start/stop/restart hadoop: login as hadoop-user and run 'sudo /etc/init.d/hadoop [start | stop | restart]' ('sudo /etc/init.d/hadoop' gives the usage)
- To format the HDFS & clean all state/logs: login as hadoop-user and run 'sudo reinit-hadoop'
- To start/stop/restart Kerberos KDC Server: login as hadoop-user and run 'sudo /etc/init.d/krb5-kdc [start | stop | restart]'
- To start/stop/restart Kerberos ADMIN Server: login as hadoop-user and run 'sudo /etc/init.d/krb5-admin-server [start | stop | restart]'
- Running M/R Jobs:
Running M/R jobs in Hadoop 0.20.S is pretty much the same as running them in a non-secure version of Hadoop, except that before running any Hadoop jobs or HDFS commands, the hadoop-user needs to get a Kerberos authentication token using the command 'kinit'; the password is hadoopYahoo1234.
For example:
hadoop-user@hadoop-desk:~$ cd hadoop
hadoop-user@hadoop-desk:~$ kinit
Password for hadoop-user@LOCALDOMAIN: hadoopYahoo1234
hadoop-user@hadoop-desk:~/hadoop$ bin/hadoop jar hadoop-examples-0.20.104.1.1006042001.jar pi 10 1000000
For automated runs of Hadoop jobs, a keytab file is provided under the hadoop-user's home directory (/home/hadoop-user/hadoop-user.keytab). It allows the user to run "kinit" without manually entering the password. So for automated runs of Hadoop commands, M/R jobs, or Pig jobs through the cron daemon, users can invoke the following command to get the Kerberos ticket. Use the command 'klist' to view the Kerberos ticket and its validity.
For example:
hadoop-user@hadoop-desk:~$ cd hadoop
hadoop-user@hadoop-desk:~$ kinit -k -t /home/hadoop-user/hadoop-user.keytab hadoop-user/localhost@LOCALDOMAIN
hadoop-user@hadoop-desk:~/hadoop$ bin/hadoop jar hadoop-examples-0.20.104.1.1006042001.jar pi 10 1000000
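For cron-driven jobs, the keytab login can be guarded so that a new ticket is requested only when none is valid. This is a sketch under assumptions: it relies on the -s (silent) flag of MIT Kerberos klist, which exits non-zero when the credentials cache is missing or expired, and ensure_ticket is a hypothetical helper name. The keytab path and principal are the ones given above.

```shell
#!/bin/sh
# Sketch: make sure a valid Kerberos ticket exists before a scheduled job.
# KEYTAB and PRINCIPAL defaults come from this post; ensure_ticket is a
# hypothetical helper name. Assumes MIT Kerberos klist with the -s flag.

KEYTAB="${KEYTAB:-/home/hadoop-user/hadoop-user.keytab}"
PRINCIPAL="${PRINCIPAL:-hadoop-user/localhost@LOCALDOMAIN}"

ensure_ticket() {
    # klist -s prints nothing; its exit status says whether a valid
    # ticket is already cached.
    if klist -s; then
        return 0
    fi
    # No valid ticket: log in from the keytab without a password prompt.
    kinit -k -t "$KEYTAB" "$PRINCIPAL"
}
```

A cron script could then run ensure_ticket && bin/hadoop jar hadoop-examples-0.20.104.1.1006042001.jar pi 10 1000000 from the hadoop directory.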
- Running Pig Tutorial:
The Pig tutorial is installed at "/home/hadoop-user/pig-tutorial". Example commands to run the Pig scripts are given in "example.run.cmd.sh". The data needed for the Pig scripts is already copied to HDFS. See more details about the Pig Tutorial at Pig@Apache.
- hadoop-user@hadoop-desk:~$ cd pig-tutorial
- hadoop-user@hadoop-desk:~$ sh example.run.cmd.sh
- Shutting down the VM:
When you are done with the virtual machine, you can turn it off by logging in as the hadoop-user and running the command 'sudo poweroff'. The virtual machine will shut itself down in an orderly fashion and the window it runs in will disappear.
Last but not least, I would like to thank Devaraj Das and Jianyong Dai from the Yahoo! Hadoop & Pig Development team for their help in setting up and configuring Hadoop 0.20.S and Pig respectively.
Notice: Yahoo! does not offer any support for the Hadoop Virtual Machine. The software includes cryptographic software that is subject to U.S. export control laws and applicable export and import laws of other countries. BEFORE using any software made available from this site, it is your responsibility to understand and comply with these laws. This software is being exported in accordance with the Export Administration Regulations. As of June 2009, you are prohibited from exporting and re-exporting this software to Cuba, Iran, North Korea, Sudan, Syria and any other countries specified by regulatory update to the U.S. export control laws and regulations. Diversion contrary to U.S. law is prohibited.