Will you fire half of your friends in 7 years?
Labels: social network, sociology
Wednesday, June 03, 2009
Will you fire half of your friends in 7 years?
Interesting study showing that the size of a personal network remains stable over the years, even though “half of all friends” are replaced every 7 years.
“The results showed that personal network sizes remained stable, but that many members of the network were new. About 30 percent of discussion partners and practical helpers had the same position in a typical subject's network seven years later. And only 48 percent were still part of the network. This finding goes against previous research which had showed that social network sizes are shrinking.”
half of all friends are replaced every 7 years
Labels: social network, sociology

Tuesday, April 21, 2009
MySQL at Google by Mark Callaghan
Mark Callaghan is taking the stage to present his keynote at the MySQL Conference and Expo, “This is not a web app: The continuing evolution of MySQL at Google.”
I am going to take notes as fast as I can. Excuse any typos etc.
Mark worked on DBMS internals at Informix and then at Oracle. He worked on embedding BerkeleyDB at a startup. He joined Google in August 2005. At Google his team is working to enhance MySQL and to support a large production deployment. He blogs at mysqlha.blogspot.com and has helped publish the Google patch for MySQL at code.google.com. In addition, he agitates for MySQL.
What is MySQL at Google?
He will give details, but some numbers he won't give. It is a large MySQL deployment. The QPS rate is tremendous. The number of machines they use is reasonably large. MySQL is used in a large, important enterprise deployment. They run many commodity machines. Google depends on replication, InnoDB and stability. MySQL is sharded with many replicas per shard. At Google, database service must always be available. They have been successful with it and happy with the results.
The database itself is providing change management. If you just push changes, you are more than likely to have a debugging nightmare. A number of replicas can be connected to a MySQL master without crashing it. You'll be surprised at how many replicas can be deployed.
MySQL is solid and easy to improve. InnoDB from Heikki Tuuri and company is amazing. Inspiration was provided by Yasufumi Kinoshita and Percona. InnoDB is the most beautiful database software Mark has worked on, and he has worked for a few database companies.
Prehistory
- MySQL 4.0 and InnoDB arrive
Consistency matters most. When choosing between consistency and availability, you want to be consistent. You shouldn't have two servers claiming to be masters. Generally, the full schema is understood by few people. Audit is a big concern. Who is doing what change? Legacy is another concern. Control is an issue. You have to show you can control access to the database. Finally, the focus is on transactions; they don't want to lose any data. Data quality is important to Google.
How do we build this?
A bad build ruins everything. He inherited a dedicated build machine. They moved to hermetic builds and cross-compilation fun and eventually learned to love autoconf.
How do we test this?
MySQL has a suite of regression tests but they are easy to pass. They have queries running in production, so how can they use those? They sample queries in production using a Python script and then replay them to simulate sample production workloads. They built stress tests, generally around replication. If you kill a slave, it can come back and start from where it left off.
Use valgrind
Eventually they realized that MySQL has valgrind support and started using it. They also discovered the value of compiler warnings.
How do we deploy this?
The simple approach is to put it out there and hope for the best. Search of error log files is automated. On a daily basis, crashes are categorized. Machines are automatically removed from service. Finally, they have automated replacement of machines.
How do we monitor this?
He has a feature request: SHOW USER STATISTICS. They archive SHOW PROCESSLIST and SHOW STATUS. Add SHOW USER_STATS and SHOW TABLE_STATS. It's amazing what you can do with awk and bash. They prefer to take a top-down approach for monitoring. They generate daily and weekly load reports, including QPS. QPS on critical servers was growing 2x per year.
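To make the load reports a bit more concrete, here is a minimal sketch (mine, not anything Mark showed) of how average QPS could be derived from two archived SHOW STATUS snapshots. It assumes each snapshot has already been parsed into a dict holding MySQL's Questions and Uptime counters; the function name and the sample numbers are made up for illustration.

```python
def average_qps(earlier, later):
    """Average queries per second between two SHOW STATUS snapshots.

    Each snapshot is assumed to be a dict holding the MySQL status
    counters 'Questions' (statements executed) and 'Uptime' (seconds).
    """
    elapsed = later["Uptime"] - earlier["Uptime"]
    if elapsed <= 0:
        # Uptime did not advance: the server restarted between the
        # snapshots or they are identical, so no rate can be computed.
        raise ValueError("snapshots must come from the same server run")
    return (later["Questions"] - earlier["Questions"]) / elapsed


# Made-up numbers, exactly one day (86,400 seconds) apart.
monday = {"Questions": 1_000_000, "Uptime": 3_600}
tuesday = {"Questions": 1_864_000, "Uptime": 90_000}
print(average_qps(monday, tuesday))  # → 10.0
```

Comparing the same calculation across weeks is one simple way to notice the kind of year-over-year growth described here.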
After deploying a better monitoring tool, they determined the growth came from queries that weren't really crucial for those servers.
How do we improve this?
Understand your problems and deploy what you build. If you are just building and not deploying, you are not going to learn the tradeoffs. Also, monitor to learn what the problems are.
Replication features added
At a high level, they are slowly moving towards self-healing. At a low level, somewhat crash-safe slaves. They use a mirror binlog, which keeps a copy of the master binlog on the slave. Other features include semi-sync replication, binlog event checksums and global transaction IDs. They are currently in the process of getting to fully crash-safe slaves. They have monotonically increasing global transaction IDs.
Performance features added
high level
low level
high level
Row-change logging
Online data drift:
- how do you compare continuously updated tables?
- technique is similar to mk-table-checksum
- deployment is more complicated
More to life than software development
Engineers at Google
Production crises
new features
new hardware
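Going back to the online data drift notes above, here is a rough sketch of the general chunked-checksum idea, written by me and not taken from the talk. It uses two in-memory SQLite databases as stand-ins for a master and a replica, and it ignores the hard part Mark pointed out, namely tables that keep changing while you compare them. The table layout, helper names and chunk size are all hypothetical.

```python
import sqlite3
import zlib


def chunk_checksums(conn, table, chunk_size):
    """Map each primary-key chunk start to a CRC32 over the rows in it."""
    checksums = {}
    rows = conn.execute(f"SELECT id, payload FROM {table} ORDER BY id")
    for row_id, payload in rows:
        start = (row_id // chunk_size) * chunk_size
        data = f"{row_id}:{payload};".encode()
        # Feed the previous CRC back in so the result covers the whole chunk.
        checksums[start] = zlib.crc32(data, checksums.get(start, 0))
    return checksums


def drifted_chunks(master, replica, table, chunk_size=2):
    """Return the chunk starts whose checksums differ between the two copies."""
    a = chunk_checksums(master, table, chunk_size)
    b = chunk_checksums(replica, table, chunk_size)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


def make_db(rows):
    """Build a throwaway in-memory database holding one small table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return conn


master = make_db([(0, "a"), (1, "b"), (2, "c"), (3, "d")])
replica = make_db([(0, "a"), (1, "b"), (2, "c"), (3, "X")])  # row 3 drifted
print(drifted_chunks(master, replica, "t"))  # → [2]
```

Checksumming by chunk means only the chunks that disagree need a row-by-row look afterwards, which keeps the comparison cheap on large tables.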
out of time. :)
Thank you for sharing, Mark! This was an informative session (of course, it would be great to actually get some numbers but still ...)
Labels: google, markcallaghan, mysql, mysqlconf

Friday, March 27, 2009
Tweetbook: Publishing Your Tweets Into A Book
James Bridle has published his 4100 tweets (posts on Twitter) in a 270-page book, "My Life in Tweets." The book covers his tweets from February 2007 to 2009. He has also provided the script he used to download all his tweets from Twitter.
In case you are wondering what his motivation was:
“When Twitter is inevitably replaced by something else, I don’t want to lose all those incidentals, the casual asides, the remarks and responses. That’s all really. This seems like a nice way to do it, and I’ll probably do it again in a couple of years time.”
I think it's a very neat idea, even though I doubt I have the patience to read someone's 4100 tweets in a book. However, I think many people will follow this and soon we'll start seeing tweet stands which will just carry books and magazines containing tweets :)

Hunch: Caterina Fake's New Killer Startup
Tired of using rock, paper, scissors, Spock, lizard to make your decisions? Well, Caterina Fake's (Flickr co-founder) new startup, Hunch, is coming soon to help you. At the heart of Hunch are decision trees that allow a human user to go through a series of questions in order to make decisions. Caterina says:
“Look. Decision-making is difficult, and decisions have to be made constantly. What should I be for Halloween? Do I need a Porsche? Does my hipster facial hair make me look stupid? Is Phoenix a good place to retire? Whom should I vote for? What toe ring should I buy? It's dark and lonely work. Coin-flipping, I Ching consultation, closing your eyes and jumping, postponing the inevitable, Rock-Paper-Scissors, and asking your sister are all time-honored means of coming to a decision -- and yet we think there's room for one more: Hunch. Hunch is a decision-making site, customized for you. Which means Hunch gets to know you, then asks you 10 questions about a topic (usually fewer!), and provides a result -- a Hunch, if you will. It gives you results it wouldn't give other people.”
Caterina Fake
Read Write Web writes:
“While we know very little about the inner workings of Hunch, it apparently combines decision trees with a fair amount of end user personalization in the form of questions it asks people visiting the site.
These questions allow Hunch to form affinities with other users who ask similar questions. On the back end, contributors will be able to create topic areas (called Super Questions) and add questions and results underneath those topics. How much control you will have or how the interface looks for this we aren't sure yet.”
Hunch seems like a brilliant idea, but it would take quite some effort and time for it to go mainstream.
Labels: caterina-fake, decision-trees, flickr, hunch

Thursday, March 26, 2009
Create free online charts with Google Chart API
I came across the Google Chart API today, which lets you create charts online for free*.
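Since a chart is just an image URL with query parameters, here is a small sketch of how one might be put together. The parameter names (cht for chart type, chs for size, chd for data, chl for labels) are the ones I remember from the API documentation; the endpoint constant and the helper name are my own assumptions, so check the official docs before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint for the chart service; verify against the official docs.
CHART_ENDPOINT = "http://chart.apis.google.com/chart"


def pie_chart_url(values, labels, width=250, height=100):
    """Build the URL of a 3D pie chart image (hypothetical helper)."""
    params = {
        "cht": "p3",                                      # chart type: 3D pie
        "chs": f"{width}x{height}",                       # image size in pixels
        "chd": "t:" + ",".join(str(v) for v in values),   # text-encoded data
        "chl": "|".join(labels),                          # one label per slice
    }
    # urlencode percent-escapes the ':' ',' and '|' separators.
    return CHART_ENDPOINT + "?" + urlencode(params)


url = pie_chart_url([60, 40], ["Hello", "World"])
print(url)
```

The resulting URL can be dropped straight into an img tag, which is what makes the API so convenient for quick charts.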
Supported charts include:
1. Line charts
2. Bar charts
3. Pie charts (Updated!)
4. Venn diagrams
5. Scatter plots
6. Radar charts
7. Maps
8. Google-o-meters
9. QR codes
The sample charts (Venn diagram, pie chart, radar chart) look really cool.
* If you plan to call the Google Chart API more than 250,000 times a day, you should let the chart developers know by emailing chart-api-notifications@google.com
Labels: chart, google, piechart, venndiagram

Friday, March 20, 2009
Google Ventures is almost here
Rich Miner, a co-founder of Android, is moving to jump-start Google's latest venture, Google Ventures, which will provide capital to startups.
Google has been working on this for quite some time. In 2007, Business Week reported that Google was going to take on a new role as a venture capitalist. In July last year, WSJ reported on Google's new venture capital arm. TechCrunch thinks Google Ventures is a bad idea and that "Starting a venture fund is not really the best use of Google’s capital." I disagree with TechCrunch, mostly because I think it can be an effective way for Google to outsource innovation and give some of the big billions back to startups.
Labels: google, googleventures, startup, venture capital

Tuesday, March 17, 2009
Cloud Computing and Community One East
Community One East is happening this week and I will be attending for sure. The event is taking place at the Marriott Marquis Hotel, New York, NY. I am especially looking forward to the announcements tomorrow, which sound very interesting :). Unfortunately, I can't go into details about what Sun Microsystems is announcing.
The first day of Community One is a free event featuring. The second day of the event is focused on Deep Dives with two half-day sessions on MySQL and two full-day sessions on Java and Web development. I will be attending the session, "Using Java EE and SOA to Architect and Design Robust Enterprise Applications." Following the conference, I will be a panelist at a Cloud Computing Panel, "How Cloud Computing Affects Small Business," at the Microsoft office in NY. After the cloud computing panel, I must rush home to attend a conference call. It's going to be a long but exciting day! It will be great to catch up with old and new friends at the event. I will also be Twitter-ing the event on my Twitter account.
Labels: cloudcomputing, conference, java, mysql, soa, sun, twitter

Monday, March 16, 2009
Java RESTful Web Services with Restlet and IDEA
I recently started using IntelliJ IDEA. To get familiar with the interface, I decided to create a Hello World REST web service with Restlet. For those unfamiliar with Restlet, it is a lightweight REST framework that allows you to quickly and easily create web services using Java. A screencast is available on the Restlet website to help you get started.
Some other useful links for beginners:
- Using Restlet with IDEA: provides information on creating a new workspace for Restlet.
- Building an ATOM (Apache Tomcat, OpenEJB, MySQL) stack
- Developing REST web services in Java
- REST on Tomcat tutorial using NetBeans
- How does Axis 2 differentiate SOAP requests from REST?

Install Tomcat 6 on Mac OS X 10.5
I needed to install Tomcat 6 today on my Macbook with Leopard. The installation was very easy. I simply downloaded the binary core, extracted the software and moved the files to my desired location. That's about it. It was pretty painless.
More in-depth instructions on how to install Tomcat 6 on Mac OS X 10.5 are also available.