1) Meetup – http://www.meetup.com/Data-Mining/events/194151752/

2) Video/Slides – Not available

Summary – 

1) TellApart is mainly into ad personalization for retail companies (ex. nordstorm). They have a large nice office. The main difference in their model is people have to click AND buy something – they operate on revenue sharing

2) One half of the talk was about lambda architecture. It is basically a big data design pattern where a datum needs to be acted upon immediately (streaming/real-time) and also in more elaborate manner later on (batch). Stupid example using music recommendation – if a person listens to only melancholic piano music but suddenly likes a grunge metal track, the immediate recommendation should be more grunge metal tracks, but later on (few mins/hours when the batch processing system has processed this datum) recommendation should include some heavy metal (or whatever goes along with melancholy piano and grunge metal). 

3) Two major models out there for lambda architecture – hadoop(batch) + storm(realtime) and spark(batch) – spark streaming (realtime). There was no clear contrast online so I asked some people there about their experiences. Quote – “A company has a storm stream pipeline, a redundant storm stream pipeline for failover, if even that fails – page engineers”. Not very inspiring.

4) The remaining part of talk was about ad placements – the math and strategy. Discussions about optimization strategy (nash equilibrium, etc), response time (a decision about placing ad has to be taken within 100ms, spark stack – 40ms), ML issues (cold start problem, models per user vs models per feature-set, modeling the competition) 

Advertisements