Archive for the ‘Elastic Computing’ Category

Public Cloud is Better than On-Premise and Netflix vs Zynga proves it

November 30, 2014

There is a common myth that for super large scale companies it makes sense to build their own data center instead of using a public cloud.

In many cases I believe the opposite is true. For many companies, time to market, focus and top line are more critical than a theoretical 20% cost saving.


Consider Netflix and Zynga. Both companies are large and smart enough to build their own private clouds.

Zynga chose to leave AWS and build their own cloud infrastructure. Netflix chose to stay on AWS, probably with a huge discount.

Their stock prices might hint at which company made the right choice.

Netflix Vs Zynga


Netflix focused the company’s energy on moving from a tech company into a movie production studio, with shows like “House of Cards”, “Arrested Development”, “The Killing” and “Orange Is the New Black”. Zynga was busy becoming a data center company instead of focusing on social games and preparing for the next big change, the move to mobile.

The more generic point is that the bottleneck in most companies is a person. More specifically, it is management attention. If everyone is busy building a private cloud and purchasing thousands of servers, no one has time to create a new business line.

The argument is that a private cloud becomes attractive at huge scale because the number of devops people needed to write the software has an upper bound.

This might be true, but there are very few people in the world who have already done it, and hiring takes a lot of time.

The other option is to hire people who are inexperienced, at least at this scale, and they will make mistakes.

Companies like Netflix and Zynga are supposed to have 70-90% gross margins. Reducing the cost of hardware from 20% to 15% is nice, but even that is not straightforward. And in any case, it is much less important than losing or creating a new $1B on the revenue side.
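To put rough numbers on that comparison, here is a minimal sketch. It assumes the 20% and 15% figures are hardware cost as a share of revenue, which the post does not state explicitly, and the function names are made up for illustration. It asks how large the existing revenue would have to be before the hardware saving matches the gross profit of one new $1B business line.

```python
def hardware_saving(revenue, old_share=0.20, new_share=0.15):
    """Annual saving from cutting hardware cost as a share of revenue (assumed basis)."""
    return revenue * (old_share - new_share)


def gross_profit(new_revenue, gross_margin):
    """Gross profit contributed by an additional revenue line."""
    return new_revenue * gross_margin


def breakeven_revenue(new_revenue, gross_margin, old_share=0.20, new_share=0.15):
    """Existing revenue at which the hardware saving equals the new gross profit."""
    return gross_profit(new_revenue, gross_margin) / (old_share - new_share)


if __name__ == "__main__":
    new_line = 1_000_000_000  # the $1B of new revenue mentioned in the text
    for margin in (0.70, 0.90):  # the 70-90% gross margin range from the text
        needed = breakeven_revenue(new_line, margin)
        print(f"margin {margin:.0%}: the saving matches the new line only at "
              f"about ${needed / 1e9:.0f}B of existing revenue")
```

Under these assumptions a company would need well over $10B of revenue before the five-point saving is worth as much as a single new $1B line.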

 

 

 

 


Can You Make Money Writing Algorithms? – Part III

April 21, 2012

“Money For Nothing and the (K) Nearest Neighbors Are Free” – Doctor Mark Knopfler

In 2007 I wrote Why it is hard to make money from algorithms and How new technologies allow making money from algorithms.

Kaggle is a fascinating example:

Kaggle is an innovative solution for statistical/analytics outsourcing. We are the leading platform for predictive modeling competitions. Companies, governments and researchers present datasets and problems – the world’s best data scientists then compete to produce the best solutions. At the end of a competition, the competition host pays prize money in exchange for the intellectual property behind the winning model.

The biggest challenge right now offers a $3M prize to improve the broken American health care system 🙂, with 951 teams competing at this stage.

If we want to go for higher numbers: Google paid $12B for Motorola’s patents. Most of these patents have probably never been used and are not-so-important-or-smart.

While I believe most software patents are idiotic, this is how the game is played these days. And in some sense it is encouraging that intellectual property, in the form of algorithms, is starting to show its economic value.

And the people with skills are doing well. From “Big Data Skills Bring Big Dough” in GigaOm:

If you can claim to be a data scientist and have the chops to back that up, you can pretty much write your own ticket even in this tough job market. A quick search of the popular job posting sites – Indeed.com, SimplyHired.com, or Dice.com – shows a huge demand for data scientists or anyone who can demonstrate other “big data” skills.

And, most importantly, the funniest show on TV right now (except for Fox News) is The Big Bang Theory, which focuses on algorithms for making friends.

The German version is even funnier

Cloud Computing in the Year 2000

October 8, 2011

I came across my old unborn thesis proposal from the year 2000. The gist of the thesis was to evaluate the economic value of cloud computing 🙂 However, the proposal was rejected by my professor as not “academic” enough.

This is sort of funny, considering that it was supposed to be in the Information Systems division of the Management faculty.

Lucky for me, I went back to the “normal” MBA program and got the degree with just seven more courses.

Anecdotes from the proposal (originally written in Hebrew):

  • Google had a 4,000-server farm 🙂
  • ASP was the term for SAAS, AIP for IAAS. Not a big difference.
  • A configuration of eight single-CPU servers was up to 6 times more cost-effective than a single 8-CPU Dell server.

Delivering AIP Solutions Using a Mass of PC-Based Servers

In recent years the ASP field has enjoyed considerable growth and development. Although some argue that the field is merely outsourcing in nicer clothing, it appears that progress in the Internet and in broadband communications, together with the cost of employing technology staff inside the organization, creates a great opportunity for it. A relatively new field is the AIP – Application Infrastructure Provider. The AIP is responsible for preparing, planning, maintaining, securing and managing the hardware and software infrastructure on behalf of the ASP. The AIP specializes in managing computer farms that run applications at very high availability, while the ASP is supposed to specialize in the applications it serves to its customers. There is also a view that broadens the AIP's scope to creating a framework for cooperation among the different ASPs that pay for its services. In this way the AIP can give its customers added value beyond the computing services.

This work examines the economic feasibility of using a large number of cheap PCs compared with using multiprocessor servers. It focuses on an application hosting (ASP) environment and on a very large number of computers. To check whether the proposal makes any economic sense, this document focuses on comparing the cost of 8-processor servers with that of 8 single-processor servers.

Motivation: The basic idea in this work comes from three factors. The first is the astonishingly low price of PCs in recent years. Thanks to the move to mass production of these computers and the constant improvement in technology, the cost of an excellent PC has dropped to $1,500. In addition, thanks to the freeware revolution, operating systems, file servers, mail servers and web servers are available at near-zero prices. The third factor is the existence of applications that are easy to parallelize and load-balance, and the great popularity of application hosting services. The combination of these three factors creates a situation in which, for certain types of applications, the relative advantages of multiprocessor servers disappear or shrink considerably, and because of their high cost there is a significant economic advantage to using a large number of PCs as an alternative.

I found strong support for the feasibility of this strategy in an article published about Google's hardware strategy. The article describes a strategy similar to the one in this proposal, already implemented at the company today on about 4,000 computers.

Advantages of the different architectures: Multiprocessor servers have several notable advantages over using a large number of single-processor computers. This topic is, of course, broad and deep, and here we concentrate only on its simplest aspects. In the SMP computing model there is one server with a single bus and shared memory, with between 1 and 64 processors. For simplicity we do not discuss cluster, CC-NUMA and MPP servers, which are less common today.

The alternatives:

For an initial comparison I chose to examine computers made by Dell[1]. Dell was chosen because it offers precise and easy pricing for all its computers and because reliable benchmark results exist for its machines. A spot check of IBM computers yielded similar results.

The application I chose for the comparison is hosting web servers that mainly access static information[2]. For simplicity one can assume that we host only a single site, but this assumption is relatively easy to relax without a large effect on the results.

The first configuration (A) is a Dell 8450 server with 8 processors and 16GB of memory. The server runs one web server. The price of such a server is about $120,000.

In the second configuration (B) we use 8 Dell 2400 servers, each with 1 processor and 2GB of memory. Each server runs one web server. All the servers are connected through a switch to a Radware load balancer, which is connected to the Internet. The Radware load balancer is a smart appliance, based on hardware and software, that dynamically distributes the load of HTTP requests across several servers. It has very sophisticated features, including geographic load balancing and taking into account conditions such as URL and cookies. The total price of this configuration is about $103,000.

The performance metric I chose is SPECweb. It measures the performance of HTTP servers and is run by SPEC, the neutral benchmark organization. By this metric, configuration A scores 3,000 SPECweb and configuration B scores about 6,000 SPECweb.

The cost-benefit ratio resulting from this comparison is 2.1 in favor of configuration B.

If we relax the conditions a little and settle for a computer with 768MB of memory in configuration B[3], the cost-benefit ratio is 4.4 in favor of configuration B.

In configuration B the Dell computer can be replaced with a no-name computer. The price of such a computer can drop to about $2,000, and the cost-benefit ratio would be 6.5 in favor of configuration B.

Overall, it appears that an economic saving of between 2.1 and 6.5 times can be achieved by choosing configuration B. There are many improvements to be made to the model and many places where simplifying assumptions were applied, but I tried to bias the comparison against alternative B in order to get lower bounds on the improvement.
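The cost-benefit ratio in the proposal can be sketched as benchmark score per dollar. The snippet below plugs in only the rounded figures quoted in the proposal (3,000 SPECweb for about $120,000 versus about 6,000 SPECweb for about $103,000); the function names are made up for illustration. With these rounded inputs the ratio comes out a little above the 2.1 quoted in the proposal, presumably because the original calculation used more exact prices and scores; that explanation is my assumption.

```python
def score_per_dollar(specweb_score, price_usd):
    """Benchmark score bought by each dollar of hardware."""
    return specweb_score / price_usd


def cost_benefit_ratio(score_a, price_a, score_b, price_b):
    """How many times more score per dollar configuration B delivers than A."""
    return score_per_dollar(score_b, price_b) / score_per_dollar(score_a, price_a)


# Rounded figures from the proposal: one 8-CPU server (A) vs. eight 1-CPU servers (B).
ratio = cost_benefit_ratio(score_a=3000, price_a=120_000,
                           score_b=6000, price_b=103_000)

if __name__ == "__main__":
    print(f"Configuration B delivers about {ratio:.2f}x the SPECweb score per dollar")
```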

An interesting conclusion already visible from this analysis is that the amount of memory the application requires is the factor with the strongest effect on the cost of the configurations and on their cost ratio. The reason is the non-linear rise in price as memory is increased, especially in the range above 1GB.

It appears that the big advantage of configuration B will be in applications that consume a lot of processing power and a small to medium amount of memory.


[1] This comparison biases the results against the simple-computer approach, since that approach is based on buying a no-name computer, which would be cheaper.

[2] Use of CGI and the like can be allowed, but the assumption is that the computer does not depend on other computers.

[3] The memory in the single server is reduced accordingly.

New Version Every Other Week – Part II

February 21, 2011

In part one I covered some principles that allow us to sustain a rate of a new version every two weeks.

In this post I’ll discuss some of the customer facing challenges and how to overcome them.

Enterprise software customers have grown to fear new product versions. Upgrades are as joyful as Freddy Krueger inside children’s dreams. One would expect that such customers would be very hesitant to change code every two weeks.

In reality, this is not a big issue, for the following reasons:

  • Industry standards – Nobody knows what “version” Google.com, CNN.com or PayPal.com is running. And frankly, nobody cares. It is a question of accountability, and if the service provider has accountability, the very basic notion is that upgrades are its problem to solve.
  • Trust – if the service performs well for a year, customers trust the update process.
  • Compatibility – obviously, external APIs have to be honored and backward compatible. But there is really no reason to change them very often.
  • Visibility – since there is no explicit external “version number”, customers are much less intimidated by changes.
  • Terminology helps. “Updates to the platform” sounds much better than “Major new release”. But terminology can’t be the only solution. Vendors tried “HotFix”, “HotFix Accumulator”, “Release Candidate”, “Service Pack”, “Feature Pack” and “Early Availability”, but customers still hate bugs. Branding alone is as impressive as renaming the janitor Chief Operations Officer.
  • Industry Standard 2 – even though SalesForce has only two releases a year, their SLA allows them four hours of downtime (for upgrades) every month.
  • Industry Standard 3 – Chrome updates itself without asking the user for permission. Windows Update, which used to be tightly controlled by IT, seems to have been working very well for quite a few years.

  • Visibility 2 – concerned customers get a deep dive into the multiple safety mechanisms mentioned in the previous post.
  • Communication – as a SaaS provider we know what features are used and by whom. If we know we want to change a feature, we speak to these users before we commit the changes.
  • Isolation – we built an extremely strong isolation model in which multiple features can run simultaneously, using only a single code base. This capability allows setting different “virtual release clocks” for every customer.
  • The benefits – at the end of the day, these releases are done to answer customers’ business needs, not to refactor code. Customers get a lot of new functionality at an amazing pace, without paying for huge upgrades, software subscriptions or professional services fees.

To summarize, using a mixture of process, technology and adaptive product management turns frequent versions into Leonardo DiCaprio rather than Freddy Krueger. BTW, it is worthwhile reading how some companies deliver 10 versions per day.

Commodity Clouds, IAAS and PAAS – Part II

February 2, 2011

In the first post we looked at some common mistakes resulting in premature “Commoditization” declarations.

In this post we will look at IAAS and PAAS in more detail.

In software, it is rare to have Nobel-prize-worthy discoveries. Still, that does not mean all inventions are trivial. From the high-level analyst point of view, Windows XP, Vista and Windows 7 share the same technology. In the real world, there are many differences. In the real world, Vista was a complete failure although it was “a commodity operating system”, and Windows 7 was well accepted.

And these days we have people speaking about IAAS (infrastructure as a service) as a “dying dinosaur” because PAAS (platform as a service) is the new king. They must be kidding. Let’s reconsider the facts.

  • Force.Com, the first PAAS, is not working out. I don’t know of any major company that built its entire successful new business on top of it. The licensing, performance and “Governor rules” caused it to fail. What works nicely inside salesforce.com did not work well for the rest of the world. Maybe that’s why they bought Heroku. Did any of their other acquisitions (DimDim\ManyMoon\Jigsaw\Etacts) run on Force.Com?
  • VMFORCE.COM does not exist yet, as far as I can tell. It is just a press release at this stage. When I read through the hype, there is no cloud portability at all, and it still looks like running Java on a single server with no scaling or multi-tenant capabilities. The home page seems quite stale.
  • AZURE is not much better off. At its current stage Azure is more similar to COM+ than it is to .NET. Microsoft has invested so much marketing money in Azure that people think it actually has something that can compete with EC2. In the real world, Microsoft has no solution to run virtual machines in the cloud for public access. Their PAAS solution cannot run any of their applications – SharePoint, Exchange, Office, SQL Server, Dynamics are all running on an internal IAAS solution, not on Azure PAAS. Wait 3-7 years for this to happen.
  • Did anyone hear of “Facebook” or “Twitter” using any PAAS platform? Funny, but they are not keen to run their services on their biggest competitors’ platforms. I wonder why.
  • Even Amazon EC2, which is by far the market leader and innovator, has a long, long road ahead to achieve the core feature set. Seriously. They added user management a few weeks ago, only through the API, after four years in production. That’s probably the #1 feature any enterprise expects to find in any software service.
  • No one has really solved the problem of WAN-based storage replication (despite bandwidth being “a commodity” 🙂). This is critical for IAAS success in the enterprise.

The most expensive and longest effort is rewriting existing software. Trillions of dollars were spent coding existing applications. Why would anyone rewrite the same business logic on a new platform, if they don’t need the scale?

VMWARE succeeded because it has great economic benefits without requiring a rewrite. PAAS is probably the right way to go in the long run, but might stay marginal for quite a long time, IMO. IAAS has had a great start and will continue to evolve, but is far from being a commodity when looking beyond the hype.


Does SLA really mean anything?

January 31, 2011

I believe most SLAs (Service Level Agreements) are meaningless.

In the world of Software as a Service and cloud computing it has become a very popular topic, but the reality is very different from theory.

In theory, every service provider promises 99.999% availability, which means less than 6 minutes of downtime per year.

In reality, even the best services (Amazon, Google, Rackspace) had events of 8 hours of availability problems, which means they are at 99.9% availability, at best.


High Availability 99.999 Downtime Table from Wikipedia
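The conversion between an availability percentage and downtime is simple arithmetic. Here is a minimal sketch, assuming a 365-day year; the function names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Minutes of downtime per year permitted by an availability target."""
    return (100 - availability_percent) / 100 * HOURS_PER_YEAR * 60


def achieved_availability(outage_hours):
    """Availability percentage for a year with the given hours of outage."""
    return 100 * (1 - outage_hours / HOURS_PER_YEAR)


if __name__ == "__main__":
    print(f"99.999% allows about {allowed_downtime_minutes(99.999):.2f} minutes per year")
    print(f"One 8-hour outage leaves about {achieved_availability(8):.3f}% for that year")
```

A single 8-hour incident already consumes roughly ninety years' worth of the "five nines" downtime budget.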

Moreover, the economics just don’t make any sense. SLAs cannot replace insurance.

Imagine the following scenario.

E-commerce site “MyCatsAndSnakes.Com” hosts its consumer site at “BestAvailabilityHosting”, which uses networking equipment from “VeryExpensiveMonopoly, Inc.”

If MyCatsAndSnakes is unavailable, the site owner, “Rich Bastardy”, loses $100,000 per hour of downtime.

Rich pays BAHosting $20,000 per month and they promise him 99.999% availability.

BAHosting bought two core routers in high availability mode, connected to three different ISPs. Each router costs $50,000 and Platinum support is another 30% per year. So the total cost is $130,000 for the first year.

One horrible day, the core routers hit a software bug and the traffic to MyCatsAndSnakes is dead.

Since the routers run the same software, the high availability setup does not help to resolve the issue, and VeryExpensiveMonopoly’s top developers have to debug the problem on site. After 8 hours of brave efforts, cats and snakes are being sold online again.

Try to guess the answers to the following questions:

  • How much money did Rich lose? (Hint: $100,000*8 = $800,000)
  • How much money would Rich get from BestAvailabilityHosting? (Hint: (8/(24*30))*$20,000 = $222)
  • How much money would BAHosting get back from VeryExpensiveMonopoly? (Hint: $0)
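The arithmetic behind these hints can be sketched as follows. It uses only the numbers from the scenario (8 hours of outage, $100,000 lost per hour, a $20,000 monthly fee credited pro rata over a 30-day month); the function names are made up for illustration. Evaluating the pro rata formula gives roughly $222.

```python
HOURS_PER_MONTH = 24 * 30  # the scenario uses a 30-day month


def customer_loss(loss_per_hour, outage_hours):
    """Revenue the site owner loses during the outage."""
    return loss_per_hour * outage_hours


def pro_rata_credit(monthly_fee, outage_hours):
    """SLA credit when the provider refunds only the unavailable share of the month."""
    return monthly_fee * outage_hours / HOURS_PER_MONTH


loss = customer_loss(100_000, 8)
credit = pro_rata_credit(20_000, 8)

if __name__ == "__main__":
    print(f"Loss: ${loss:,.0f}")
    print(f"Pro rata credit: ${credit:,.2f}")
    print(f"Share of the loss covered: {credit / loss:.4%}")
```

Even a full refund of the monthly fee would cover only 2.5% of the loss in this scenario, which is why the credit cannot work as insurance.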

The networking vendor, VeryExpensiveMonopoly, does not give any compensation for equipment failure. This is true for all hardware and software vendors.

They don’t even have an SLA for resolution time. The best you can get with platinum support is “response time”, which is not a great help.

As a result, the hosting provider cannot have a back-to-back guarantee or insurance for failures in networking.

The hosting provider limits its liability to the amount of money it receives from Rich ($20,000 per month), which makes sense.

Moreover, the service provider would only compensate pro rata, so the sum becomes even more negligible.

But that does not help Rich at all, as his losses are far bigger. He lost $800,000 of cats and snakes deliveries to young teenagers across Ohio.

The real answer, IMO, is “Insurance”. If Rich really wants to mitigate his risk, he can buy insurance for such cases.

The insurance company should be able to assess the risk and apply the right statistical cost model. Asking a service provider to do it is useless.

SLAs might be a good way to set mutual expectations, but they are certainly not a replacement for a good insurance policy or a DRP.

Here is an interesting review of CRM and SalesForce.com (lack of?) SLA. And here are the SLAs of Amazon EC2 and RackSpace.

Amazon: “If the Annual Uptime Percentage for a customer drops below 99.95% for the Service Year, that customer is eligible to receive a Service Credit equal to 10% of their bill”

GoGrid promises 10,000% but “No credit will exceed one hundred percent (100%) of Customer’s fees for the Service feature in question in the Customer’s then-current billing month”

RackSpace promises 100% availability, but “Rackspace Guaranty: We will credit your account 5% of the monthly fee for each 30 minutes of network downtime, up to 100% of your monthly fee for the affected server.”

Again, I don’t think one can blame these service providers, but the gap between perception and reality seems major.

There are three real answers for customers who want an SLA from a service provider:

1) It would be better than on premise

2) How much are you willing to pay for extra availability?

3) We have a great insurance agent 🙂



Commodity Clouds? You must be kidding

January 29, 2011

A commodity is a good for which there is demand, but which is supplied without qualitative differentiation across a market. Commodities are substances that come out of the earth and maintain roughly a universal price. (Wikipedia)

I find it hilarious when some people describe clouds or the IaaS market as a “commodity”, or even worse – “legacy”.

It is a common mistake that I see again and again, made by people who don’t have a clue what they are talking about or who just ignore the little details.

These are the little details you might call “reality”.


The first point I want to make is that “Commodity” is often misinterpreted as “Easy to Produce” or “Low Margin, Bad Business”.

Take a look at oil production. While the end product does not have qualitative differentiation, its production requires some of the most sophisticated technology available. Drilling oil from the bottom of the sea necessitates huge investments, great science and amazing technology.

Moreover, six of the ten biggest companies in the world are in the oil production sector, so maybe it is not such a bad business to be in.

Another example would be x86 chips. The x86 architecture is more or less the same as it was 30 years ago. It is available universally and there is no qualitative differentiation between different items. However, building a new fab costs around $2B and Intel is one of the most successful companies on earth. No one would argue that there is no intellectual property in chip design.


The second important point is that vision is nice, but reality is nicer. A friend told me that in the late 90’s the technologists at Check Point thought that Intrusion Detection technology was an erroneous direction to follow. They thought that comparing signatures of attacks is reactive, and that passively monitoring the attacks does not help the customer.


While they were right in their long-term vision, ISS sold hundreds of millions of dollars of IDS software in the meantime. Moreover, when the market shifted to IPS (Intrusion Prevention Systems), ISS had good solid technology to start from, which took Check Point five more years to accomplish. As my father, the CFO, used to say, “The markets fix themselves in the long run, but in the long run we all die”. Technology adoption cycles are longer than they seem.

Some analysts are looking too far ahead. For example, two years ago everyone talked about hypervisors as being commoditized. Microsoft and Citrix would give it away for free, KVM is free anyway and VMWARE would have to follow. Surprisingly, in the last 12 months VMWARE sold more than $2B worth of, guess what, hypervisors.

Why are 200,000 customers being so silly and paying so much money when the analysts say differently?

For one reason, because Microsoft Hyper-V does not support NFS yet, which is probably used by 40% of customers. Because Hyper-V cannot handle memory over-commit, which means you’ll get about 30% less capacity from the same hardware. Because VMWARE Virtual Center is two generations ahead of Microsoft’s management server, and there is not much use for a hypervisor that can’t be managed. See a nice post from 2008 about it.

So are the analysts the stupid ones?

Of course not. But they have not installed a hypervisor in the last five years. Furthermore, they are probably right in the long run. In three years from now (five years from 2008 :) ) hypervisors might become a commodity. But the pace is much slower than it seems at first.

Remember how in 2000 broadband Internet was just around the corner? We’re in 2010 and only South Korea has upload and download speeds above 20Mbps. More on the commodity subject, especially in clouds, in my next post.

Lady Ga-Ga or: How I Learned to Stop Worrying and Love the Facebook

January 30, 2010

The western world ended quite suddenly.

The news, and pictures, about Lady Ga-Ga actually being a man, were first reported by Steve Jobs as he presented Apple’s new iPlot gadget at a secret location.

127 journalists immediately tweeted the story, and it was soon re-tweeted by 13,068 followers.


The tweets were automatically converted into 1,675,042 LinkedIn notifications, which turned into 300,000 automatic WordPress updates.

Then Google picked the news up and sent alerts to 1,020,068 Lady Ga-Ga followers and 1,002,900,3 day traders.

However, the big problem started as the new automatic “Google Alert” to “Facebook comments” mechanism kicked in.

Since Facebook comments automatically generate Twitter alerts, a vicious positive feedback cycle was created.

Twitter->LinkedIn->WordPress->Google->Facebook->Twitter.

Soon, 95% of the computing power of the western world was targeted at breaking the (false) news to the same people again and again.

When New York lost its electric power due to the high consumption by data centers, Google decided to cancel Google Wave and create a super algorithm to solve the problem.

They took five of their Nobel prize winners, who had been working on JavaScript optimizations, and asked them to solve the problem.

Google’s geniuses quickly realized the problem is similar to solving the “bipartite graphs with no induced cycle of length > 6” problem, but just when they were ready to solve it, the network on their T-Mobile Android phones crashed. The only person to hear about Amazon’s EC2 explosion was President Obama, with his secure Blackberry.

As San Francisco, Tel Aviv, Rome and London lost all electric power, the mob started rioting over the food supplies. Unfortunately they starved after two days because all of the food was organic.

Luckily, China was saved, as Google decided to block them, or vice versa.

Bar Refaeli, DNA Sequencing and Cloud Computing

December 7, 2009

Much like Bar Refaeli and Leonardo DiCaprio, DNA sequencing and cloud computing go hand in hand.


I had a very interesting conversation with a friend yesterday about DNA sequencing and cloud computing.

My friend is leading one of the largest cancer genome research projects in the world (and yes, he is extremely bright).

It appears that there is great progress in DNA sequencing technology, which is based on a chemical process. The pace is much faster than Moore’s law. As a result the budgets are shifting from the chemistry side to the computational side.

In the past, the budget would be 90% for biology and 10% for analyzing the data coming out of the DNA.

As the sequencing costs have fallen by orders of magnitude, there is more and more data (a single patient’s genome data is one terabyte).

The more data, the more computing power is needed to analyze it, and hence the budget split becomes 50-50.

Each computation can take up to 24 hours, running on a 100-core mini grid.


In theory, such tasks are great for cloud computing IAAS (Infrastructure as a Service) platforms or even PAAS (Platform as a Service) solutions with MapReduce capabilities. This EC2 Bioinformatics post provides interesting examples.

In practice there are three main challenges:

  1. Since cancer research facilities need this server power every day, it is cheaper for them to build the solutions internally.
  2. To make things even more challenging, the highest cost in most clouds is the bandwidth in and out of the cloud. It would cost $150 to store one patient’s data on Amazon S3, but $100-$170 to transfer it into S3.
  3. Even if the cost gap can be mitigated, there can be regulatory problems with the privacy of patients’ data. After all, it is one person’s entire DNA we are talking about. Encryption would probably be too expensive, but splitting and randomizing the data can probably solve this hurdle.

So, where do clouds make most sense for this kind of biological research?

One use case is the testing of a new, improved algorithm. In that case, the researchers want to run the algorithm on all the existing data, not just the new data.

They need to compare the results of the new algorithm with those of the old algorithms on the same data set. They also need to finish the paper on time for the submission deadline :).

In such scenarios there is a huge burst of computation, needed on static data, in a very short period of time. Moreover, if the data can be stored on a shared cloud and used by researchers from across the world, then data transport would not be so expensive in the overall calculation.

These ideas are fascinating and will hopefully drive new solutions, cures and treatments for cancer.


Why Won’t The Big Big Giant Eat You for Lunch ?

October 11, 2009
Oh, man! We killed Mr. Burns! Mr. Burns is gonna be so mad! – Homer Simpson

Giant Stepping On A Small Company

One of the most annoying questions I had to answer in the last couple of years was “Why can’t Cisco\IBM\Microsoft\VMWARE\HP easily copy what you do?”

To some extent, it is another variation of the annoying “What’s your intellectual property?”

Both questions are studied in first-year MBA courses. They seem to make sense at first glance, but I will try to show they are highly overrated questions.

The underlying assumption is that a BBG (Big Big Giant) can use its amazing resources, huge capital, loyal customer base and brand to kill any small company if the small company does not have a great barrier to entry, which is typically a technological one.

Having worked in a few BBGs and a couple of start-ups, I beg to differ. The giants tend to fail by themselves.

StartUp Beats Big Big Giant Corporate


Let’s start with some questions:

  • Why was Sun unable to succeed with its own firewall (SunScreen?) when it tried to stop OEM’ing Check Point’s?
  • Why was Check Point repeatedly unable to take the SOHO firewall market (FireWall-First, Small-Office, Safe@Home, Secure-1)? NetScreen took it from CP. Then Fortinet did the same thing to NetScreen.
  • Why does Microsoft still lack a significant footprint in the firewall business?
  • Why has Microsoft’s ten-billion-dollar research budget failed to copy Google’s search algorithm for ten years?
  • How come Google Video lost to YouTube?
  • Why is VMWARE leading over Microsoft in virtualization? Microsoft acquired Connectix in 2003. Connectix virtualization technology was almost as good as VMWARE’s at the time. Today there is a big gap in market share.
  • How come IBM, with years of building supercomputers, does not have an elastic cloud solution?
  • How does small Riverbed perform so well among the networking giants?
  • What was the huge intellectual property in Windows that OS/2 lacked?

Here is a hint to my proposed answer to why giants fail, with details to follow in Part II:

  1. Time.
  2. Focus.
  3. Execution.
  4. Constraints.
  5. Culture.
  6. Investors.
  7. Golden Cage Syndrome.