
Overview

I recently finished reading a book by Sunil Soares: Big Data Governance. The whole “Big Data” topic has been exploding – so I’ve done a lot of research into the area. With my background in data architecture (which inherently recognizes the value of data) the concept of applying Data Governance principles to Big Data was interesting. So I broke down and spent the $30+ on the Kindle book so I could better absorb this concept for possible use in my professional career.

Sunil Soares used to work for IBM and was one of the authors of a free eBook: the IBM Data Governance Unified Process.   I read that eBook previously and was able to absorb some of the material (I need to re-read to get more out of it).  It felt like some of the information in this book was very similar to what was written in that free eBook (so if you’re cheap like me you may want to start there).

As usual I’m going to talk about what themes I saw/learned from reading this book (which could have come from other literature by him and others).   The book itself has a pretty organized outline of the steps he would recommend for various data governance principles relating to Big Data.  You can get most of that by simply reading the Table of Contents of the book so I’m not going to repeat it here.

I believe that the “Big Data” phenomenon has clearly demonstrated the value that data can provide. That said – it’s the insight derived from the data where the real business value is. Sunil Soares says: “its value must be truly understood and unlocked by deriving insights that are revealed through analysis and then translating those insights into information, knowledge, and ultimately action.” Companies are finding profound ways to take data – from many sources – and derive insight they never could before (either it was impossible or too expensive).

What is Big Data Governance?

Sunil Soares defines Big Data Governance as follows: “Big data governance is part of a broader information governance program that formulates policy relating to the optimization, privacy, and monetization of big data by aligning the objectives.”  This definition implies that Big Data Governance sits in the context of an already existing Data Governance program.  Therefore it seems that the author is saying that most of the principles for Data Governance in general would apply to Big Data Governance.  I’m not sure that I agree with that – but we shouldn’t ignore what has been learned in traditional Data Governance programs.

Why Big Data Governance?

I think the first question to answer is why you would even want Data Governance as part of a Big Data program. Some of the promise of Big Data analytics is that you don’t have to do all the traditional Data Warehouse work to get results. The concept is that you can load the raw data into your Hadoop environment and perform some advanced analysis and voilà – out come meaningful results. You don’t need to perform rationalization, cleaning, summarizing, etc. – you just work with the data as it is.

While some of that is true – it’s somewhat of a misleading picture. It’s true that one of the advantages of putting all the data into the Hadoop environment is that you don’t have to rely on a sample of the data (which may not represent the whole well). For example: if you were provisioning bandwidth for a set of servers and used an averaged sample of the currently used bandwidth you may miss the occasional spikes in bandwidth that really drive what’s needed [Scott Kahler explains this better than I did in this keynote video: http://www.kcitp.com/2012/09/03/big-data-kansas-city-technology-events/].
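To make the averaging point concrete, here is a minimal Python sketch. The readings are made-up numbers (not real measurements) and `provisioning_summary` is just a name I picked for illustration.

```python
from statistics import mean

def provisioning_summary(samples_mbps):
    """Return the average and the peak of a list of bandwidth readings (Mbps)."""
    return {"mean": mean(samples_mbps), "peak": max(samples_mbps)}

# Hypothetical per-minute readings for one hour: mostly quiet, two short spikes.
readings = [20] * 58 + [400, 450]
summary = provisioning_summary(readings)

# The average looks modest, but the peak is what the link actually has to carry.
print(summary)
```

If you sized the link from the average alone you would badly under-provision for the two spike minutes, which is the risk of working from a summarized sample instead of the full data.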

The other major difference is including far more variety of sources in your analysis – including segments of unstructured text.  The technology now allows us to efficiently process through far more volumes and variety of data than we could before.  We’ve also advanced in how we can process text and other varieties of data – in terms of our algorithms and other advanced processing. The ability to combine so much data together to get a picture is fascinating.

That said – it’s not quite that easy or simple. When we say unstructured data what we often mean is that there is some unstructured text within an otherwise structured container. That structure may not be as rigid – in that maybe not everything is present or it’s more variable – but it still has a structure. Therefore an effort needs to be made to understand that data – especially in terms of its reliability.

Here are a few examples of data that may not be what it seems:

  1. User names in Social Media – they’re not always a real name.  In some social media sites there is no guarantee that the user name is a person’s real name (or any real name).  This is significant as one of the goals is often to tie a master customer record to their social media data.
  2. Sunil Soares mentioned the term “unique visitors” (in the context of clickstream data).  One site/source may measure the # of unique visitors a week vs. another measures it within a month.  If you directly compared this data without addressing this you would get skewed results.
  3. Let’s say we have a measurement that represents the average temperature for the last hour. If one measurement is a rolling average (taking into account a large # of previous values) while another covers only that hour, comparing the two directly would be misleading.
  4. Location data – does each data source assign the same meaning to the same value?  If you matched data solely on the values would you really be matching the same location?
  5. Another of Sunil Soares’s examples was sensor and part terminology in railroads. If we can determine that sensor event #282 typically occurs before part #339 fails does that part have the same # in different cars/engines? Do the different sensors produce the same code for the same event? Would we need some type of cross-reference table to map these together?
  6. At a higher level consider whether the same data is being pulled into your Hadoop environment multiple times. Is Data Source Q really the same as Data Source A? Did we end up wasting storage space, transmission and possibly licensing cost on duplicate data?
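The cross-reference table idea from the railroad example can be sketched in a few lines of Python. Everything here is hypothetical: the source names, the code 17, and the event name are invented for illustration, and only event #282 comes from the example above.

```python
# Hypothetical cross-reference: (source system, local event code) -> shared event name.
EVENT_XREF = {
    ("engine_type_a", 282): "bearing_overheat",
    ("engine_type_b", 17): "bearing_overheat",
}

def normalize_event(source, code):
    """Translate a source-specific sensor code into a shared event name.

    Returns None when the (source, code) pair has not been mapped yet.
    """
    return EVENT_XREF.get((source, code))

# Two different local codes resolve to the same shared event.
print(normalize_event("engine_type_a", 282))
print(normalize_event("engine_type_b", 17))
# The same number from the other source is NOT assumed to mean the same thing.
print(normalize_event("engine_type_a", 17))
```

The point of keying on the source as well as the code is that matching on the raw value alone would silently treat unrelated events as the same.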

There are other dimensions of concern that are not technical – but a function of the complex and inter-related environment we all live in:

  1. Privacy – despite the fact that some think privacy is dead there are serious concerns around privacy and Big Data.
    • Consider who really owns the data. Is it yours or the customer’s? Most social media sites will tell you that the data is the customer’s – you can’t own it (you may even have to delete it if they ask you).
    • Are you, by combining data, creating new types of sensitive data that didn’t exist before?
    • Have you built safeguards into your Big Data platform to control who has access to what (security is not part of the native Hadoop platform)?
  2. Regulatory – regulatory agencies don’t care how the sensitive data is stored (i.e. Hadoop) – they will hold you accountable regardless.
    • Are you in a highly regulated industry such as HealthCare?
    • Are you dealing with sensitive corporate data governed by regulation?
    • Do you have industry constraints – such as PCI (credit cards)?
    • Do you know what the regulations are in each country you operate in (they are often different)?
  3. Reputation
    • Even if something is legal – it may not look very good in the eyes of your customers or partners
    • You must weigh the risk of the impact to your reputation vs. the revenue potential

What is different about Big Data Governance?

So the next question is whether “Big Data Governance” is really any different than traditional governance for operational or enterprise reporting systems.   I believe it can and should be different – as it’s often for a different purpose.  Sunil Soares puts this well:

Big data needs to be “good enough” because poor data quality does not necessarily impede the analytics that are required to derive business insights.

You may have heard of ETL (Extract, Transform, Load) but now there is a new term: ELT (Extract, Load, Transform). At its simplest the concept is that the data is loaded in its raw form and then transformed – not the other way around. This is possible because we can both afford to load the raw data and have the computing power to transform it in place. Therefore data quality may be enforced on the fly – instead of before the data is at rest. So the focus is on making a reasonable effort on the data that’s imported instead of making it pristine before it’s loaded.
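Here is a small Python sketch of the ELT idea: land the record untouched, then apply a “good enough” cleanup when the data is read. The in-memory list stands in for raw files on the platform, and the field names and cleanup rules are my own assumptions, not anything from the book.

```python
raw_store = []  # stands in for raw files landed on the Big Data platform

def load_raw(record):
    """ELT 'Load': store the record exactly as it was received."""
    raw_store.append(record)

def clean(record):
    """ELT 'Transform': a 'good enough' cleanup applied at read time."""
    return {
        "customer": record.get("customer", "").strip().lower(),
        # A missing or empty amount is treated as 0.0 rather than rejected.
        "amount": float(record.get("amount") or 0),
    }

def read_clean():
    """Return cleaned views of the raw records without changing what is stored."""
    return [clean(r) for r in raw_store]

load_raw({"customer": "  ACME ", "amount": "12.50"})
load_raw({"customer": "Globex"})  # incomplete record is still loaded
print(read_clean())
```

In a traditional ETL flow the second record would likely be rejected or repaired before loading. Here it is kept as-is and the cleanup rule can change later without reloading anything.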

How do I implement Big Data Governance?

The next question is then what’s a framework for implementing Big Data Governance.  Here are some of my thoughts (hopefully organized enough to be useful):

  1. Know your Data.
    • Catalog your internal and external data.  Other than for a sandbox don’t let data into your Big Data platform unless it’s cataloged
    • Understand your data – not in complete detail but the overall quality, time scale, etc.
    • Document some of the key fields within your data – ones that aren’t intuitive and that are key to using the data effectively.
    • Develop a method to document and share this metadata
  2. Know your organization and your platform.
    • Understand who can be involved in data quality – both at its source and while it’s in your Big Data platform
    • Understand what your platform can do – good and bad.
  3. Understand constraints, regulations, etc. – especially by region.
    • Understand your legal, ethical and internal constraints.
    • Evaluate these by region – as they can differ greatly
    • Understand what your organization’s commitment level is regarding platform and people resources
  4. Determine what data needs to be cleaned up and what needs to be protected.
    • Flag data that needs to be cleaned and why it needs to be cleaned
    • Flag data that is sensitive and needs to be protected.
    • Develop a method to document and share this metadata
  5. Determine how and when to clean and protect the data.
    • Will you clean your data before it hits your Big Data Platform, after it hits it or in real time?
    • Determine strategies for cleaning that data
    • Determine strategies for protecting sensitive data and overall security schemes
  6. Evaluate how you are doing on a regular basis.
    • Establish routine meetings (quarterly, yearly, etc.) to evaluate how things are going
    • Create the expectation that this is a process and that changes will be common
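As a rough illustration of the “catalog first” rule and the flags from steps 1 and 4, here is a minimal Python sketch. The `CatalogEntry` fields, the function names and the sample source are all hypothetical; a real metadata catalog would hold far more.

```python
from dataclasses import dataclass

@dataclass
class CatalogEntry:
    """Minimal metadata kept about one data source (illustrative fields only)."""
    name: str
    owner: str
    sensitive: bool = False        # needs to be protected
    needs_cleaning: bool = False   # flagged for cleanup
    cleaning_reason: str = ""      # why it needs to be cleaned

catalog = {}

def register(entry):
    """Add or update a data source in the catalog."""
    catalog[entry.name] = entry

def can_ingest(source_name, target="production"):
    """Sandbox loads are always allowed; anything else must be cataloged first."""
    return target == "sandbox" or source_name in catalog

register(CatalogEntry(
    name="clickstream_weekly",
    owner="web team",
    needs_cleaning=True,
    cleaning_reason="unique visitors counted per week, not per month",
))

print(can_ingest("clickstream_weekly"))               # cataloged source
print(can_ingest("twitter_feed"))                     # not cataloged yet
print(can_ingest("twitter_feed", target="sandbox"))   # sandbox exception
```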

In conclusion I believe introducing Data Governance into a Big Data environment is a worthwhile choice.

I recently finished reading the book “Steve Jobs” by Walter Isaacson after checking it out from the library.  I’ve been wanting to read this book for some time – as Steve Jobs is such an interesting figure.  He had a profound impact on technology and how we use it.

Reading this book was a bit of a journey into my past, remembering the technology. I remember playing Oregon Trail on an Apple II (e I think) in Elementary School [Sidebar - I got distracted from writing this post by playing that game in an emulator. I'm not sure I ever won at school either.]

I was born in 1973 so I remember a lot of the technology of early PCs – having experienced many of them. Despite the fact I grew up with this technology I still find it hard to believe how limited it was compared to what we have today.  In this day and age computers have become more of a commodity and not a special thing.  But back then, any computer was a work of art and full of amazing technology to accomplish what it did.

It is the character of Steve Jobs that is the most interesting part of this book.  He was a very unique individual – almost a force of nature.  He had such an impact on the technology we use today – not just our computers but on so much consumer technology.  He had a vision for what we needed before we knew it – an instinct that defied logic.  He made possible what many thought was impossible – and we all benefited from it.

That said – would I want to be him?  No – he’s not my role model as a person.  I don’t think I would want to work for him either – given how he treated people.  In fairness he was able to get people to do more than what they thought they could do – to perform at a high level.  I’m sure there are many people who still remember him fondly as a boss and as a person – but many others do not.  Personally I couldn’t treat people the way he did – it’s just not who I am.

I think “innovative” people like him are often so focused on their vision for the future that they forget there are real people around them. In Steve’s case it was clear that he had what they call a “Reality Distortion Field” – his own view of reality. In some cases it worked for him – making things happen that seemed impossible. In other cases reality caught up with him and with others around him – often with tragic consequences.

Maybe one of the things he did well was take technology and make it usable for “normal” people. What often happens with technology is that it’s made by “geeks” – who don’t think like normal people. Therefore what comes out is designed to work the way a “geek” would want it – which often is not what a normal person would want. A “geek” designs for edge cases (rare but difficult cases) without putting full effort into the common cases.

Steve made technology elegant and usable – attractive to “normal” people. He brought the graphical interface to mainstream personal computers. He made a music player that transformed the music industry – one that made others look weak (oh – and being able to buy music online). He ushered in a mobile computing era – with the iPhone taking the world by storm. He even finally designed a computing tablet that really worked – after so many others failed.

The world will miss his innovation – his focus on design and usability – not just capability.

The other day I felt like I was swimming in a sea of data – that it was all around me. I was looking at some vending machines and noticed what looked like an old fashioned antenna on them. From a previous conversation with a maintenance technician I knew the vending machines are wirelessly connected back to their office. He told me they knew everything that happened on the machine – every keystroke/transaction that occurred. The primary purpose of that was to prevent fraud – as there is an audit trail to reconcile the amount of money brought back to the office from the machine.

I realized though they could learn a lot more from the data they are gathering than how much money should be in the machine.  They could tell if an item needed to be restocked and predict what items would sell well in the machine.  But that’s just the beginning – as they can learn about us from our purchases.

Imagine what they could do if they aggregate that data across many machines.  What could they predict – what could they learn? Would they be able to see trends across a metro area – or possibly the country?  Would they be able to tell about the health direction of the employees of a company based on what type of items are purchased more frequently?

What value would that data have – would anyone want to purchase it?  Could it assist with the supply chain – would that data be valuable to their suppliers?  What other data could it be correlated with to add additional insight?

These days there is so much data – those magical ones and zeroes – floating through the air. These streams of data about us and others are constantly surrounding us – flying through the air around us.  Our phones track our location, cable boxes track what we watch, websites gather and share information (ever seen ads about things you’ve searched on follow you?), who knows what the government now can track…

The world is changing – in subtle and profound ways – by what data we and others have. Do you get your bills electronically or by the mail? When was the last time you “developed” a picture (not sure my kids even understand that concept)? Meters are going electronic – parking, electric, gas, water – with real-time feedback to the companies that manage those resources.

Sometimes when I stop and think it feels like there should be the streams of 1s and 0s flying through the air around me.  Maybe I’ve just watched too many movies like the Matrix and Tron or maybe it’s because through my life I’ve seen so much change in technology.  When it comes to data a simple example comes to mind: how much bandwidth is now available.  In college I remember a 56K modem being fast – now I can get 10 meg on my phone.  When my first daughter was born I had to write back to tape some of the edited video as there wasn’t room on my hard drive.  Now I’m going through and extracting the video off of all the raw DV tapes and the DVDs I created as I have the hard drive space.

So maybe it’s just me – but then again maybe it’s all of us.  What do you think?  Are you swimming in data?

The other day my phone left me – taking a trip without me. I got to my office and realized I had the phone holster but no Galaxy S3 in it. I looked around at work and then took a trip to my car thinking it had fallen out in the car. It wasn’t in the car – which wasn’t a good sign. I tried calling the phone – thinking that I may have left it at home – but no one answered. I happened to have my laptop with me at work that day so I powered it up and went to the Lookout website. Lookout is an Android security app – that both scans apps and provides some other security features. One of the unique features is the ability to locate your phone – to a large degree of accuracy.

I have used Lookout for some time – as it’s a free app that seemed to help with security.  I actually used the locate feature once on my previous phone when I thought I had lost it. I “located” my phone from my house and saw that it was near where I work in Downtown Kansas City.  That night I drove back to work and found my phone right next to where I park – a happy outcome.


This time around the outcome wasn’t quite so happy – as my phone wasn’t nearby. After I clicked on locate in the website it showed me where my phone was – at around 31st and Linwood – which is over 5 miles away from where I work. That’s when I knew this didn’t look good – as I definitely didn’t drop it around there – someone or something was moving it. I tried both making the phone “scream” (quite annoying) and calling the phone multiple times. Lookout has some advanced options – which require upgrading to the premium version ($). After upgrading I decided to “lock” the phone to safeguard the data I have on the phone (it was set up to require a password to unlock the phone normally).

Strangely enough I located the phone again and it had moved – to the public library near Prospect. I made the phone “scream” again to try to get someone’s attention. I even called the library to see if anyone had turned in a lost phone. At this point I decided to “wipe” the phone to be on the safe side – as I was afraid whoever had it might turn it off. After wiping it I could no longer track it so it was essentially “lost” at that point.

I had signed up for the insurance through Sprint as I knew this was still a very expensive phone. After starting that process I found out there was a $150 deductible I had to pay to get it replaced. The irony is that I only paid $50 for the thing – due to a Black Friday deal. The process of replacing the phone was fairly quick and painless, although part of the paperwork had to be printed off. During the day I was able to complete part of the process – and fax/upload the documents in the evening.

The phone arrived at my house the next day before I went home.  After I got home from AWANA with my kids I started setting up the phone.  It seems like it gets easier each time to get the phone back up and running – especially with Google as the key player for the phone.  I love having my contacts centralized through Gmail – as it’s the same list everywhere.  I also had Dropbox this time as a repository of the photos from the phone – which was great.

Since I had to buy a new case I decided to get something with a more secure holster – as I think my phone fell out of the one I had.  I bought a Seido case as it had a locking holster.  I still dislike the extra bulk of the rugged holsters – but they do have their place.

Today my daughter tried to get into my phone and I was notified about it.  Apparently Lookout premium will take a picture with the front-facing camera if someone enters the wrong password 3 times (she thought I was kidding when I showed her the e-mail).  The technology today to know what’s going on with the phone even when you are away from it is fascinating.  It may not help recover the phone – but it does make for an interesting story.  Hopefully you will never lose your phone – but today it can be resolved without too much difficulty.

At the IOD conference this year the volumes of data discussed were amazing. I’ve been thinking about how much the volumes of storage have grown in my own lifetime. I remember my first exposures to computers – both at home and at school. I remember my dad building a personal computer and using a tape recorder (i.e. cassette tape) to store data. The Apple IIes at school were more advanced – as they had the early 5 1/4 inch floppy drives. One of my first IBM compatible (MS-DOS) PCs, which I used for a while, didn’t even have a hard drive initially. It had a 3 1/2 inch floppy drive that you booted from first – and then used a different disk to store data. Getting my first hard drive was really cool. I can’t remember how big it was, but it was likely close to 500 megabytes.

Today we’re starting to measure data not in megabytes or gigabytes – but in petabytes and zettabytes (and soon yottabytes). For understanding here are some basic definitions (using decimal prefixes, where each step is a factor of 1,000):

  • kilobyte – 1,000 bytes (1,024 bytes in the binary convention)
  • megabyte – 1,000 kilobytes (1,000,000 bytes)
  • gigabyte – 1,000 megabytes (1,000,000,000 bytes)
  • terabyte – 1,000 gigabytes (1,000,000,000,000 bytes)
  • petabyte – 1,000 terabytes (1,000,000,000,000,000 bytes)
  • exabyte – 1,000 petabytes
  • zettabyte – 1,000 exabytes (1,000,000 petabytes)
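For anyone who wants to check the unit ladder, here is a short Python sketch. It assumes the decimal (SI) convention where every step is a factor of 1,000; the `UNITS` list and `bytes_in` helper are just names I made up for this example.

```python
# Decimal (SI) storage units, smallest to largest. Each step is a factor of 1,000.
UNITS = [
    "kilobyte", "megabyte", "gigabyte", "terabyte",
    "petabyte", "exabyte", "zettabyte", "yottabyte",
]

def bytes_in(unit):
    """Number of bytes in one of the named decimal units."""
    return 1000 ** (UNITS.index(unit) + 1)

for unit in UNITS:
    print(f"{unit:>10}: {bytes_in(unit):,} bytes")

# How many petabytes fit in a zettabyte?
print(bytes_in("zettabyte") // bytes_in("petabyte"))
```

Note that operating systems often report sizes using the binary convention (factors of 1,024), so the numbers you see on screen can differ slightly from these.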

Another way to put this in perspective is to compare these numbers to the storage objects that I’ve encountered in my lifetime.  For example,

  • Remember those 3 1/2 inch disks (see picture above)? They held 1.44 megabytes – which was a good amount in the early to mid 1990s when I was in high school and college
  • For backup I had an Iomega Zip Drive – essentially a removable hard drive. These had a capacity of 100 megabytes – which meant a few of these disks could back up an entire hard drive at the time (which were probably around 500 megabytes)
  • CDs typically held about 700 megabytes and were in most cases read only.
  • DVDs can hold up to 4.7 gigabytes of data (which at the time seemed a lot).
  • I have an inexpensive USB flash stick – which holds about 4 gigabytes itself (which are now less than $10)
  • You can buy a 2 terabyte hard drive now at MicroCenter for $99.
So how many of these would you need to contain a petabyte of data?
  • 500 2 terabyte hard drives (1000 terabytes / 2 = 500)
  • 212,766 DVDs (1000 terabytes x 1000 gigabytes / 4.7 gigabytes = 212,766)
  • 250,000 4 gigabyte USB sticks (1000 terabytes x 1000 gigabytes / 4 gigabytes = 250,000)
  • 1,428,571 CDs (1000 terabytes x 1000 gigabytes x 1000 megabytes / 700 megabytes = 1,428,571)
  • 10,000,000 Iomega 100 megabyte ZIP disks (1000 terabytes x 1000 gigabytes x 1000 megabytes / 100 megabytes = 10,000,000)
  • 694,444,444 3 1/2 inch disks (1000 terabytes x 1000 gigabytes x 1000 megabytes / 1.44 megabytes = 694,444,444)
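The same arithmetic can be reproduced with a few lines of Python. This sketch assumes nominal decimal capacities (so a “1.44 megabyte” floppy is treated as 1,440,000 bytes) and rounds up, since a partly filled last disk still counts. Because of the rounding up, the CD and floppy counts come out one higher than a plain rounded division.

```python
PETABYTE = 10**15  # bytes, decimal convention

# Nominal capacities in bytes (decimal units, as printed on the label).
MEDIA = {
    "2 TB hard drive": 2 * 10**12,
    "DVD (4.7 GB)": 4_700_000_000,
    "USB stick (4 GB)": 4 * 10**9,
    "CD (700 MB)": 700 * 10**6,
    "Zip disk (100 MB)": 100 * 10**6,
    "3 1/2 inch floppy (1.44 MB)": 1_440_000,
}

def units_needed(total_bytes, capacity_bytes):
    """Ceiling division with integers: how many units are needed to hold total_bytes."""
    return -(-total_bytes // capacity_bytes)

for name, capacity in MEDIA.items():
    print(f"{name}: {units_needed(PETABYTE, capacity):,}")
```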
Another way to think of it is how much video fits in a petabyte. My Sony digital camera takes 1080 AVCHD video – which takes up 1.2 megabytes of disk space for every second of video. So a 1 minute video (60 seconds) uses 72 megabytes of space. A petabyte is 1,000,000,000 megabytes, so a petabyte can hold over 13.8 million minutes of that video (about 231,000 hours, or roughly 26 years of continuous footage).
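Reworking the video arithmetic in Python, with a petabyte taken as 10^15 bytes (1,000,000,000 megabytes) and the camera's rate taken as a flat 1.2 megabytes per second (real AVCHD bitrates vary, so treat this as a rough sketch):

```python
PETABYTE = 10**15             # bytes, decimal convention
BYTES_PER_SECOND = 1_200_000  # 1.2 MB for each second of video (assumed constant)

seconds = PETABYTE // BYTES_PER_SECOND
minutes = seconds // 60
hours = seconds // 3600
days = hours // 24

print(f"{minutes:,} minutes")
print(f"{hours:,} hours")
print(f"{days:,} days (about {days / 365.25:.1f} years)")
```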
So personally I’m astounded by how much more disk space is available now than I remember having. What’s also amazing is how we can fill up that space (including me). I had a 500 gigabyte hard drive in my desktop computer (which is really just used to centrally store data) that I thought was quite large when I bought it a few years ago. Earlier this year I ended up buying a 1 terabyte hard drive to replace it – as pictures and video were starting to fill it up. I personally think it will be an interesting race between how fast we can develop affordable storage vs. how fast we can generate data to fill that up.
For nostalgia purposes here are some pictures of the items mentioned above:

At IOD recently IBM introduced a 4th V to Big Data – Veracity. Webster’s Dictionary defines “Veracity” as “conformity with truth or fact”. This dimension of Big Data has less to do with the inherent characteristics of the data than with how it needs to be used (maybe it hints at the lack of consistency/quality in big data sources).

One of the points made in an elective session was that if “Big Data” is to be used for decision making then it needs to be trusted data by that point. In a publicly traded company an explanation may be requested as to why a decision was made – and therefore being able to demonstrate the lineage of the data and show its trustworthiness is very important. It’s one thing to explore data to derive insight (experiment) but it’s another to implement that in a production environment (one which drives decisions).

Therefore once the phase of initial discovery and experimentation has ended (not that it can’t be ongoing) effort needs to be made to ensure the quality of the data. This may involve techniques from the data warehouse world (such as data cleansing) or audits to verify the quality of the data. While Big Data isn’t the structured data of the relational world effort does need to be made to ensure its quality. For example, should you ensure that data sources aren’t duplicated in the data you are deriving insight from? Should you ensure that “spam” is being filtered out successfully – so that it isn’t skewing your results?
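One simple way to audit for duplicated sources is to fingerprint each one and compare. The sketch below is only an illustration under stated assumptions: each source is a small in-memory list of text rows, exact row-for-row copies are the only duplicates it catches, and the function and source names are invented.

```python
import hashlib

def fingerprint(rows):
    """Order-independent fingerprint of an iterable of text rows."""
    digests = sorted(hashlib.sha256(r.encode("utf-8")).hexdigest() for r in rows)
    return hashlib.sha256("".join(digests).encode("ascii")).hexdigest()

def find_duplicate_sources(sources):
    """Return (first_seen, duplicate) name pairs for sources with identical content."""
    seen = {}
    duplicates = []
    for name, rows in sources.items():
        fp = fingerprint(rows)
        if fp in seen:
            duplicates.append((seen[fp], name))
        else:
            seen[fp] = name
    return duplicates

# Hypothetical sources: Q holds the same rows as A, just in a different order.
sources = {
    "data_source_a": ["2012-10-22,store1,42", "2012-10-23,store1,40"],
    "data_source_q": ["2012-10-23,store1,40", "2012-10-22,store1,42"],
    "data_source_b": ["2012-10-22,store2,17"],
}
print(find_duplicate_sources(sources))
```

Near-duplicates (reformatted dates, extra columns, partial overlaps) would need fuzzier matching than this, which is part of why knowing your data up front matters.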

Does this possibly slow things down? Yes and no – yes in that this does take time – but likely less time than figuring out later what went wrong (when the lack of quality of the data leads to a bad decision). It does mean, like many things done at an enterprise level, that care must be taken to ensure repeatable results that are based on the best data available. Time will have to be taken to verify that the data is consistent enough to be useful for analysis. The predicted results must be compared to actual historical results.

So again the point of Big Data is NOT the technology – but the new insights that can be achieved.  In order for those insights to be reliable the “veracity” of the data must be ensured.

FYI – here are the 4 V’s together:

  • Volume
  • Velocity
  • Variety
  • Veracity

 

Thursday is my last day at IOD – the end of an interesting journey.  I think my mind is close to information overload and I know I’m certainly tired.  I still need to make the journey home – but I expect that to be the easy part.  It will be adjusting back to the real world and digesting this information that will be harder.

In the General Sessions and most of the elective sessions I attended there were some common themes:

  • Big Data is not just about the size of the data
  • Big Data represents an enormous potential for the future
  • Data Governance is an important discipline – even for Big Data
  • Understanding who a customer is (across business lines) is a very common issue for businesses [which is what Master Data Management was made to address]
  • We are in a tremendous growth period in the data we create – with much of it “unstructured” [a term I'm still not comfortable with]
  • IBM is poised to help many businesses with all of these issues (surprise!)
  • We have systems designed on the notion of scarcity – that CPU, disk, and memory are scarce and expensive – and that people and agility are cheap.  The current reality is the opposite – CPU, disk, and memory are cheap but people and agility are expensive

Overall I have enjoyed the conference and found it very well run. There was a large staff helping throughout the conference and the Wi-Fi was great. This was a unique experience for me as IOD is “Big” (to contribute to the overuse of this term). At first it was a bit overwhelming – especially dealing with the very large crowds here at DST. As I was here I realized there was more to do than I could possibly do (including Vegas itself). Frankly Las Vegas is not a town I’m very interested in (just not who I am) but I do like the views of the mountains.

Would I come again if I have the choice? Yes – I would enjoy it and feel like it would benefit me and my company. I don’t expect to come again as I see this as a special opportunity. Thank you to everyone who helped make this an enjoyable and valuable experience.

 


Some interesting facts/quotes from IOD:

  • 90% of the world’s data was created in the last 2 years.
  • 2.7ZB of digital content in 2012 (up 50% from 2011)
  • 68% of IT operating costs in 2013 (up from 29% in 1996) will be for management and administration
  • Only 1 in 5 organizations allocate more than 50% of IT budgets to new projects
  • 30 billion RFID sensors and counting
  • 25+ TB log data every day
  • 76 million smart meters in 2009, 200M by 2014
  • “Data is the new oil. Data is just like crude. It’s valuable, but if unrefined it cannot really be used”
  • 855 security incidents – 174M records
  • 2.2x competitive advantage if you really do better with analytics
  • Notion: There is a scarcity of disk and CPU – so we sacrifice agility and people (which are now more expensive than the disk and CPU)
  • Gartner – by 2014, 85% of currently deployed data warehouses will not scale to meet new information volume and complexity demands
  • U.S. 1st in healthcare spending, 37th in Quality
  • $750B or 30 cents of every dollar spent on healthcare is wasted
  • 50% of all incoming calls to a call center are escalated or left unresolved
  • We have for the first time an economy based on a key resource [information] that is not only renewable, but self-generating.  Running out of it is not a problem, but drowning in it is.

I have completed Day 4 at IOD – a Wednesday. Frankly it already feels like a week has passed – not half a week. I feel like I’m finally understanding how to get around this place – the day before I leave. Today was a nice day – starting with a great breakfast. In addition to a number of sessions (including the General Session) I took a deeper dive into an IBM tool around Reference Data.

Again – my sessions have been revolving around Big Data and Information Governance / Master Data Management. I’m not a DBA – but a data architect – so I’m interested in understanding the value of data. I feel like after all these sessions my head is swimming with data – especially a lot of quotes. I may add another post of some interesting facts that have been mentioned in all the sessions.

Overall I have enjoyed being at IOD – as it’s quite an interesting experience. I realized the EXPO is already over – before I had a chance to go through it all (even though I did enter a number of iPad contests). I have enjoyed meeting a great number of people (especially those from other countries) along with the learning opportunities. It’s a fast pace – one where your head is often left spinning – but an enjoyable experience.

This blog post is shorter as I’m quite tired (still dealing with the time change).  I’m grateful for my chance to attend IOD and chance to share my thoughts with you.
