
June 09, 2009

Prediction: Channel Consolidation

As we live life, our actions are recorded across countless channels, e.g., text messaging threads versus ATM transactions and so on.  Channel separation is why your bank doesn’t know where you were physically located yesterday and your doctor doesn’t know the contents of your work emails.  While we take channel separation as a given, channel consolidation is the trend and our society is heading in this direction at warp speed.

Channel consolidation is an essential ingredient in improving the accuracy of prediction (e.g., your on-line retailer wisely inferring your interests).  For the most part, consumers love this as it makes life more efficient.  Businesses equally like this as better prediction means more efficient operations.  And for all these same reasons the public sector (from intelligence and law enforcement to social services) is just as keen to enjoy improvements in prediction.

Facebook makes for a great example of channel consolidation.  All your emails, instant messages, status updates, past, present, and future travel, annotated photos, your social circle, memberships, self-expressed interests, and more … all bundled together in one nice little package, under your user account.  Traditionally such life details are expressed on diverse channels – unobservable to any single entity.  No more.  Facebook, with this panoramic view of its users, now likely has a substantially more complete picture of a person than almost any other single entity.

How powerful is this?  Here is one example: if you are a Facebook user, maybe you have noticed the increasingly relevant (spooky smart) ads.  I get ads that read “Are you 44, a triathlete, and want abs like this?”  Or a well-timed ad over the summer when I was in Southern California that read: “Are you looking for a triathlete coach in the Orange County area?”  It is so relevant I find it very hard not to click on the ad!  (Be assured I do resist.)

The more sense Facebook makes of users, the better the service, the more folks will find Facebook irreplaceable, the more users will flock to the platform, and last but not least, the more advertisers are willing to pay.  Everyone seems to win.

Hence, channel consolidation is inevitable primarily because it is irresistible.  [More here]

Consumers actually demand this. For instance, you expect that your healthcare provider will channel consolidate your data (lab work, prescription history, etc.) to properly care for you – or you may sue them for negligence! 

Nonetheless, it takes no leap to realize this very big and very important question: ‘who consolidates which channels and for what purpose?’

Law and policy will inevitably determine which entities can access and commingle which data (channel consolidation) and under what conditions.  At the same time, I worry that the technical means to enhance privacy (e.g., Immutable Audit Logs that facilitate accountability and oversight) are not being adopted at an appropriate pace to keep up.

One more interesting tidbit: People often use and then abandon email accounts.  And I bet most of these folks consider all those communications (e.g., associated blog comments) effectively clipped off – like a tail – and no longer of record.  However, if the data lives on, and if there are features in that data that enable channel consolidation (e.g., your name and one or more additional distinguishing features) … then it is quite possible that these bodies could be raised from the dead.  Hummm…
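To make the mechanics concrete, here is a minimal sketch of how records from separate channels might be linked once they share a name plus one more distinguishing feature.  The record layout, the field names, and the matching rule are all assumptions made up for illustration; real-world matching is far more involved than an exact comparison.

```python
from collections import defaultdict

# Hypothetical records from three unrelated channels (invented data).
records = [
    {"channel": "old_email", "name": "Pat Doe", "birth_year": 1965},
    {"channel": "blog_comments", "name": "pat doe ", "birth_year": 1965},
    {"channel": "retailer", "name": "Pat Doe", "birth_year": 1972},
]

def consolidate(records, extra_feature="birth_year"):
    """Group channels whose records agree on the normalized name AND one extra feature."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["name"].strip().lower(), rec[extra_feature])
        groups[key].append(rec["channel"])
    return dict(groups)

linked = consolidate(records)
for identity, channels in linked.items():
    print(identity, "->", channels)
```

In this toy data the abandoned email account and the old blog comments land in the same group, while the retailer record with a different birth year stays separate.  The name alone would have merged all three.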

How to prevent channel consolidation and the resurrection of your clipped tails makes for interesting conversation – but that will have to wait.

And on the lighter side: Facebook, by the way, makes use of a fraction of what they know and uses basic algorithms at best.  I realized this when not long ago I commented to my girlfriend about how damn useful the Facebook ads are getting and she pointed out that one of her recent ads stated “Is your boyfriend gay?”  What the hell!  And then a few weeks later I get an ad that says: “Is your girlfriend cheating on you in Vegas?” 

Touché!

The less likely alternative is that Facebook is using most of the data and very smart algorithms … so smart in fact that they have an advertiser intentionally making us suspicious of each other with the intent of soon dishing up a new ad that says something like “Need a private investigator to watch your girlfriend?”

Bastards!

RELATED POSTS:

Six Ticks till Midnight: One Plausible Journey from Here to a Total Surveillance Society

More Data is Better, Proceed With Caution

Puzzling: How Observations Are Accumulated Into Context

Trust Has a Half-Life

April 20, 2009

Ironman Australia, 2009 – Misplaced Delusions of Grandeur

Port Macquarie, Australia is a very beautiful place.  The race is well run and the course is excellent – albeit a bit hilly.

The funniest thing happened on the bike.  As I have mentioned before (here) performing math functions in the head while in these races is nearly impossible – even single digit addition.  So get this: I am feeling real strong on the first of three bike loops.  Almost no one has passed me and I am flying.  I look down at my bike computer and see that I am moving along at 22mph at that moment … but have averaged just over 24mph so far!  I was giddy.  I contemplated what it would be like to finish the 112 mile bike leg of the race in well under five hours and how even the pros behind me would be miffed at this performance.  A few minutes later, my speed is still hovering around 22mph yet my average speed was up around 25mph.  How this happened I was not sure … but so it must be as the computer does not lie.  So I continue to quietly chuckle to myself about my extraordinary power.  Soon I see my average speed is almost 27mph.  Now this caused me pause.  It was like a riddle.  How could this be?  Wait a minute … I look closer at my computer and see that this incredible average speed was really “total distance travelled!”  What an idiot.  I flip the computer display over to average speed to find out my real average speed … 19.5 mph!  Ughhh.  This, I must admit, broke my will.  The next 2 hours I coasted a lot and averaged about 16mph.  What a loser.

Here is a new swim tip for hackers: When you are near the back of the swim you will find some people swimming the breast stroke.  Remember to avoid these folks.  Their frog-style kick can deliver exceptional beatings to your head.  I was reminded about this 2-3 times in the first 20 minutes of this race.  As I reflected on this over the next few miles of swimming, I was quite sure this was karma caused by my inappropriate behavior during the UK Ironman (ref: Kick to Kill).

On the subject of deadly: While driving around we discovered a giant spider (at least 3” leg span) climbing down my window inside the car.  Knowing Australia has lots of deadly critters, I was quite panicked.  The good news is … I was able to use the map to throw the spider out the window.  The bad news is … I threw the map out the window.  Michelle says “Needed map!”

Actual Important Things If You Are Going To This Race

1. Make sure your bike has no dirt on it when you land in Australia.  They will check your bike tires at Customs at the airport to see if you are bringing dirt in.  Getting your bike quarantined would be bad.

2. Be prepared for hills on the bike, humidity, and maybe some rain.

3. I recommend you get in the water and swim a bit up the river before race day.  There can be strong currents.  During my test swim (that made my arms sore because I had not swum in months) it took 45 minutes to swim up the river and 15 minutes to swim back down.  Had I not tested the water, this might have spooked me on race day.

4. The road is a bit rough in places on the bike course.  And due to the rain, there were a number of folks with flats.  So I swapped out my tires for thicker ones and rode with 110 PSI and had no problem at all.

5. I learned to drive on the left side of the road for the first time.  Although this had always been a fear, this is a great place to learn.

6. Watch for spiders and get two maps.

 

RELATED POSTS:

Hacking the 2008 UK Ironman: Kick to Kill

Malaysia Ironman versus South Africa Ironman – “Tastes like Mango”

Hacking the 2007 Brazil Ironman Triathlon in Florianopolis (May 27, 2007) – Strategy, Tragedy and 100% Pure Agava Tequila

Preparing for the 2007 New Zealand Ironman in Singapore?

Handicapped at the 2006 Arizona Ironman

Surviving the 2006 France Ironman and How Intelligent are Chimpanzees?

Dumb and Dumber: Consequences of the 2006 Silverman Triathlon

What sharks? Reflections on the 2005 Western Australia Ironman

March 16, 2009

Nation At Risk: Policy Makers Need Better Information to Protect the Country

Last Tuesday, March 10, The Markle Foundation Task Force on National Security in the Information Age released a report titled: Nation At Risk: Policy Makers Need Better Information to Protect the Country. (PDF here)

Members of the Task Force who prepared this report included: William Crowell, Bryan Cunningham, Jim Dempsey, John Gordon, Slade Gorton, Jeff Jonas (me!), Judith Miller, Jeffrey Smith, Abraham Sofaer, Rick White, and Richard Wilhelm.

We made the following five recommendations, calling for the accelerated creation of an information sharing framework:

1. Reaffirm Information Sharing as a Top Priority

2. Make Government Information Discoverable and Accessible to Authorized Users by Increasing the Use of Commercially Available Off-the-Shelf Technology

3. Enhance Security and Privacy Protections to Match the Increased Power of Shared Information

4. Transform the Information Sharing Culture with Metrics and Incentives

5. Empower Users to Drive Information Sharing by Forming Communities of Interest

The report is relatively short and to the point, just 27 pages long. It includes a handy four page appendix summarizing all of the recommendations (pages 22-25).

Here are a few elements I would like to bring to the attention of my readers:

Recommendation 2 speaks to "Discoverability." Our Task Force built on our earlier recommendations (in prior reports) to use data indices – much like the card catalog at the library. Using indices, users can locate data in the enterprise and, if qualified for authorized use, limit their access to just the records of relevance. Notably, this model means less data is being transferred around, therefore less data must be kept in sync. The risk of unintended disclosure is mitigated to a degree because fewer copies of data are being made. In short, indices allow users to locate just what they need and no more.
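For readers who think in code, the card catalog idea can be sketched roughly as follows.  This is only an illustration under my own assumptions: the class names, the fields on each card, and the sample agencies are hypothetical and do not come from the report.  The point it shows is that the index holds pointers to records, not copies of them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndexCard:
    """A pointer to a record.  It holds no record contents."""
    custodian: str       # organization holding the record (hypothetical field)
    source_system: str   # system where the record lives (hypothetical field)
    record_id: str       # identifier of the record inside that system

class DiscoveryIndex:
    def __init__(self):
        self._cards = {}  # normalized attribute value -> set of IndexCard

    def add(self, attribute, card):
        self._cards.setdefault(attribute.lower(), set()).add(card)

    def lookup(self, attribute):
        """Return pointers only; the records themselves stay in the source systems."""
        cards = self._cards.get(attribute.lower(), set())
        return sorted(cards, key=lambda c: (c.custodian, c.record_id))

index = DiscoveryIndex()
index.add("Pat Doe", IndexCard("Agency A", "case_db", "A-17"))
index.add("Pat Doe", IndexCard("Agency B", "tips_db", "B-42"))

hits = index.lookup("pat doe")
for card in hits:
    print(card.custodian, card.source_system, card.record_id)
```

An authorized user would then go to the custodian for the specific records found, which is why fewer copies end up floating around.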

In recommendation 3, the Task Force makes a number of specific security and privacy recommendations. One of my favorite examples is:

"… including implementation of real-time audits of user compliance and behavior and immutable audit logs that record how a system has been used …"

These Immutable Audit Logs are clever little inventions that allow oversight and accountability groups to see exactly how a system has been used. Even a system administrator cannot secretly change the past by altering the log. Imagine that every time someone peeks into the card catalog, whether they find something or not, this is recorded in an indelible manner. What cards they saw, what books they looked at, and so on … all accounted for with no way to hide the facts.
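The report does not prescribe how such a log is built.  One common way to make a log tamper-evident is hash chaining, where each entry includes the hash of the entry before it.  The sketch below assumes that technique; the `AuditLog` class and its fields are hypothetical names chosen for illustration.

```python
import hashlib
import json

class AuditLog:
    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(body):
        # Hash a canonical JSON form of the entry body.
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def append(self, user, action):
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {"user": user, "action": action, "prev": prev}
        self.entries.append({**body, "hash": self._digest(body)})

    def verify(self):
        """Return True only if no entry has been altered, removed, or reordered."""
        prev = self.GENESIS
        for entry in self.entries:
            body = {"user": entry["user"], "action": entry["action"], "prev": entry["prev"]}
            if entry["prev"] != prev or entry["hash"] != self._digest(body):
                return False
            prev = entry["hash"]
        return True

log = AuditLog()
log.append("analyst1", "searched catalog for 'Pat Doe': 2 cards found")
log.append("analyst1", "searched catalog for 'nobody': no cards found")
print(log.verify())
```

Note that a chain like this only detects quiet edits.  Someone able to rewrite the whole log could recompute every hash, so a real design would also need to anchor the latest hash somewhere outside the administrator's control.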

The use of the term "metrics" in recommendation 4 is also worth special mention. One such metric the Task Force would like to see is reporting what percent of a system’s information is discoverable, i.e., the percentage of records that have corresponding cards in the card catalog. Case in point: what is the value of a book on the shelf at the library if there is no related card in the catalog?
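As a rough sketch, the discoverability metric could be computed like this.  The function name and the sample identifiers are made up for illustration, and I am assuming each record has a unique identifier that also appears on its index card.

```python
def percent_discoverable(source_record_ids, indexed_record_ids):
    """Share of source records that have at least one card in the index."""
    source = set(source_record_ids)
    if not source:
        return 0.0  # an empty system has nothing to discover
    return 100.0 * len(source & set(indexed_record_ids)) / len(source)

# Hypothetical library: four books on the shelf, three with catalog cards.
books_on_shelf = ["b1", "b2", "b3", "b4"]
cards_in_catalog = ["b1", "b3", "b4"]

score = percent_discoverable(books_on_shelf, cards_in_catalog)
print(f"{score:.0f}% discoverable")
```

Book `b2` is the one sitting on the shelf with no card, which is exactly the case the metric is meant to expose.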

Finally, as indices are going to be the way information sharing gets accomplished … in my opinion, the essential policy debate must immediately begin considering the following:

1. How many indices? There are benefits for fewer. And there are different benefits for more. One index for law enforcement and one for intelligence? Or, one index for foreign collections and another for open source? One index, 20 indices, or 100?

2. Where will these indices physically reside? At Google? (Note: Google is an index already used for open source).

3. Which key attributes will be placed in the index? While the library uses subject, title and author … maybe one index will need to contain who, what, where, when.

4. How much latency between data changes in source systems and their corresponding change in the index? For example, if a watch list record is deleted in a source system, what is the maximum amount of time one would want to wait until the same record is redacted from the card catalog?

5. When a user searches the card catalog, who do you notify when an index card is found?

a. Only the inquirer?

b. Only the owner?

c. Both?

d. Neither, only a third party is notified?

e. No one can be told?

6. When there is any notification, what information is revealed about the index card to the inquirer and/or data owner?

a. All the attributes related to the index card?

b. Some of the other attributes on the card (e.g., search author and see title)?

c. No other attributes are revealed?

d. The custodian organization of the data record?

e. The custodian source system of the record?

f. The actual record number used to identify this piece of data in the source system?

g. The user, if any, associated with this record, e.g., the analyst’s name and phone number?

7. When there is a notification to a data owner, what information is revealed about the inquirer?

a. All the attributes related to the search?

b. Some of the other attributes on the search?

c. No other attributes are revealed?

d. The inquirer’s organization?

e. The inquirer’s source system?

f. The actual session number used to initiate the inquirer’s search?

g. The inquirer’s name and phone number?

8. What user audit standards and processes will be required to ensure the system is being used in accordance with law and policy?

9. What metrics will be kept and who can see which metrics?
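One way to see why these questions matter is to notice that each one eventually becomes a setting somebody has to choose.  The sketch below writes questions 4 through 7 as a policy object.  Every name and every value here is a hypothetical placeholder of mine, not a recommendation and not something from the report.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum

class Notify(Enum):            # question 5: who is told about a hit?
    INQUIRER = "inquirer"
    OWNER = "owner"
    BOTH = "both"
    THIRD_PARTY = "third_party"
    NO_ONE = "no_one"

@dataclass(frozen=True)
class IndexPolicy:
    max_latency: timedelta             # question 4: allowed lag behind the source system
    notify: Notify                     # question 5
    revealed_to_inquirer: frozenset    # question 6: card attributes shown to the inquirer
    revealed_to_owner: frozenset       # question 7: inquirer attributes shown to the owner

def recipients(policy):
    """Translate the notification choice into a list of parties to notify."""
    return {
        Notify.INQUIRER: ["inquirer"],
        Notify.OWNER: ["owner"],
        Notify.BOTH: ["inquirer", "owner"],
        Notify.THIRD_PARTY: ["third_party"],
        Notify.NO_ONE: [],
    }[policy.notify]

def index_is_stale(policy, lag):
    """True if the index trails the source system by more than the policy allows."""
    return lag > policy.max_latency

# Placeholder values chosen only so the example runs.
watch_list_policy = IndexPolicy(
    max_latency=timedelta(minutes=5),
    notify=Notify.OWNER,
    revealed_to_inquirer=frozenset({"custodian_organization"}),
    revealed_to_owner=frozenset({"inquirer_organization"}),
)

print(recipients(watch_list_policy))
print(index_is_stale(watch_list_policy, timedelta(minutes=10)))
```

Different indices could carry different policies, which ties back to question 1 about how many indices there should be.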


RELATED POSTS:

Discoverability: The First Information Sharing Principle

Information Sharing: Got Directory?

No Need to "Over Share" – Thoughts on Information Sharing

It’s All About the Librarian! New Paradigms in Enterprise Discovery and Awareness

Federated Discovery vs. Persistent Context – Enterprise Intelligence Requires the Later

Immutable Audit Logs (IAL’s)

Found: An Immutable Audit Log

Full Attribution, Don’t Leave Home Without It

Out-bound Record-level Accountability in Information Sharing Systems

Data Tethering: Managing the Echo

To Anonymize or Not Anonymize, That is the Question