Apache's Wacky but Winning Recipe for Big Data Development

That didn’t turn out so well. While people liked Lucene, they found that the GPL (which uses a “copyleft” approach) restricted use of the software in their businesses, and so they started complaining to Cutting. “It frustrated me because I was trying to give something to people so they could use it freely, and they were saying they couldn’t use it. Clearly this was the wrong license,” he told the audience of about 500 in the Hyatt Regency ballroom.

“Then some folks approached me from this wacky place called Apache,” he said. “They said ‘You can join us at Apache. We have a different license and you can join us here.’ I said, What the heck, seemed like a good thing to try.”

And the rest, as they say, is history. Lucene, of course, went on to become the most widely used search engine in the world. “I’d like to think it’s [so popular] because it’s technically awesome, but that is clearly not the reason,” Cutting said. “The real reason is much more likely because it’s benefited from Apache’s approach, from the open software approach in general, and Apache more specifically.”

Cutting’s Apache experience with Lucene would shape his next open source project, a little Web search engine he called Nutch. When Cutting and his development partner, Mike Cafarella, needed more computing resources to scale Nutch, Cutting joined Yahoo. When the duo needed more skilled programmers to help develop Nutch and to make it scale to big data heights, they looked to the Apache Software Foundation.

“Nothing is sacred,” says Doug Cutting (image courtesy YouTube)

Nutch, of course, would go on to become Apache Hadoop, the big data platform that today rivals Linux in terms of open source success. Cutting’s fateful decision to give the ASF a shot with Lucene turned out to be a catalyst that would kick start the modern big data movement.

“This was a real blessing for me, that this Apache process was an accelerant for software,” Cutting said. “It could help software become better and succeed and become a standard in a way that other methods couldn’t.”

Cutting lauded the hands-off approach of the ASF, which he says enabled Apache Hadoop and related projects to evolve according to the needs of the users/developers themselves, not by some far-off organization or corporation that’s removed from the day-to-day challenges of actually running this stuff.

This decentralized approach reduces the friction to innovation. “We have this process where there are random mutations sprouting up all over,” Cutting said. “Some of them end up in the incubator and become top level projects at Apache. But mostly what happens is that people start using them and they decide which ones work.

“It’s a very organic process of improving and selecting the next thing,” he continued. “It’s not set by vendors. It’s not set by foundations. It’s set by users and I think that’s a wonderful change, and I think it’s leading to not only faster change, but change that’s more directed to the problems that we already care about, and where they need solutions.”

The Apache community’s unyielding commitment to change gives traditional enterprise IT professionals the cold sweats. It’s not that development is chaotic at Apache; as Cutting said, the group manages to harness the energy of its participants to address the pain points in a relentlessly directed fashion. But the ruthless dedication to continuous improvement at Apache rubs the suits the wrong way, a dynamic that is a major factor in the controversy surrounding the Open Data Platform initiative (ODPi).

Cutting is still against the ODPi, even after the ODPi apologized, clarified, and monetized the ASF to the tune of $40,000. In his view, the ODPi threatens the Apache way, and that isn’t acceptable. Over his 15 years of working within the ASF, Cutting has become a fearless soldier for the ASF’s process, to the extent that he would even be willing to see Hadoop supplanted by whatever comes next.

“Nothing is sacred,” Cutting said. “Any components can be replaced by something that is better.”

Cutting is a living embodiment of the Apache process. The Cloudera chief architect hit it out of the ballpark with game-changing software not just once, but twice, and he’s willing to relegate it to the legacy dustbin for the chance at creating something better? This level of institutional purity gives Cutting star status among the open source faithful, but it’s fair to say that it also scares the heck out of CIOs at multi-billion-dollar corporations who just want IT platforms that don’t change on a monthly basis.

But in Cutting’s view, this is no time for stability if it means passing on the opportunity to have an even bigger impact. If you like what the ASF has given to you up to this point (given to you for free, mind you), then just wait till you see what’s next.

“The pace of change in the big data ecosystem is astronomically greater than we saw for the 20 preceding years,” Cutting said. “We’re really now benefiting, and the way this change is happening is a key to that. It’s decentralized change. There’s no one organization or handful of organizations deciding what are the next components in the stack.”

Luck played a big role in Hadoop’s success, Cutting said. If he hadn’t already been developing Nutch, and if he hadn’t read Google’s GFS and MapReduce white papers after struggling to make it scale, Hadoop might not have come to pass.

But also key to that is this Apache process. “Without that, we wouldn’t have been able to get these users involved to build the ecosystem that is now flourishing,” he said. “What’s coming next? What are the next hot technologies? I don’t know. If I knew I’d work on them or invest in them.

“That’s the wonderful thing–nobody knows.”
