This is an experiment in live blogging, so beware ;). My immediate impressions: the focus on advertising systems was interesting. Large-scale analysis was a theme of many talks, and (as one would expect) the answer was always to use a map-reduce-style infrastructure.
Attributor: revenue debugging in advertising systems
- Advertising systems are complex. They involve users, publishers (e.g., bing.com), lots of services (publisher interfaces, auctioning algorithms, etc.), and advertisers
- Want to answer: why is revenue (or revenue-per-search) anomalously down?
- Need to detect that an anomaly is occurring and then find its root cause. The talk focuses on root-cause analysis
- Some example root causes:
- Datacenter in Dublin had latency issues that resulted in fewer ads being served
- More people searched for non-monetizable terms (papal election!)
- Contributions: novel algorithms for root cause analysis in ad systems, attribution for derived measures, the tool itself.
Data used & algorithm
- Presenter says aggregate performance counters must be used because of sheer scale of ad systems. Not sure I buy this, since there’s been lots of work on getting detailed data at scale (Dapper, X-Trace, etc.)
- Aggregate performance counters can include fundamental metrics and ‘derived metrics.’ There are lots of metrics
- Algorithm basics: the root cause chosen should explain as much of the change as possible and should be as simple as possible.
- When choosing between multiple candidate root causes, measure ‘surprise’ using the entropy (divergence) between the actual and expected results.
- Framed as a multi-objective optimization problem that tries to maximize surprise (a rough sketch of the surprise computation follows this list)
- There is something interesting about attributing root causes for derived measures, but I didn’t quite get the intuition. Will need to look at paper. I wish the presenter had spent more time on this…it seems like it’s a key element of the talk.
- Presenter points out that the tool identified the right root cause in some cases where operators performing a manual analysis had identified an incorrect root cause
- How would you generalize this to more markets? Answer: the evaluation is on four markets, but it could be used on more.
- Can the data collected by the tool be used to identify ways to maximize advertisers’ revenue? Answer: There are lots of tools that can be created.
- Can you apply this system to regular online services? Answer: tool is not necessarily specific to advertising. Need experience applying it to different systems before making any definitive statements
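To make the ‘surprise’ idea concrete, here is a minimal sketch of how I imagine it (my reconstruction, not necessarily the paper’s exact formulation): score a dimension (data center, device type, etc.) by the Jensen-Shannon divergence between the forecast and observed shares of the measure across that dimension’s values; the most surprising dimensions are the strongest root-cause candidates.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [(ai + bi) / 2 for ai, bi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def surprise(expected, actual):
    """Surprise of one dimension: divergence between the forecast and
    observed shares of the measure (e.g. revenue) across its values."""
    tot_e, tot_a = float(sum(expected.values())), float(sum(actual.values()))
    keys = sorted(expected)
    p = [expected[k] / tot_e for k in keys]
    q = [actual.get(k, 0.0) / tot_a for k in keys]
    return js_divergence(p, q)

# Toy example: Dublin's revenue share drops far below forecast, so the
# "data center" dimension scores as highly surprising.
forecast = {"dublin": 400, "chicago": 300, "singapore": 300}
observed = {"dublin": 150, "chicago": 310, "singapore": 290}
print(round(surprise(forecast, observed), 3))
```

A ranking step would then pick the smallest set of dimension values that explains most of the change, matching the ‘explain as much as possible, as simply as possible’ criterion above.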
DECAF: Detecting and Characterizing Ad Fraud in Mobile Apps
- Ad network (here, Microsoft’s) provides an ad plugin, which a mobile app developer includes in their app. The ad plugin displays ads based on the user’s preferred interests, etc.
- The ad network pays the developer based on the number of ads delivered or click-throughs
- Developers have an incentive to commit fraud by inflating clicks and views. This is a big problem given the size of the ad market.
- 1 billion dollars lost due to ad fraud in 2013 (wow…I wish some of that money were going my way!)
- Ways to commit ad fraud: manipulate the visual layout to create invisible impressions or unintentional clicks. For example, a developer can include too many ads on a single screen, or legitimate elements the user wants to click can ‘cover up’ an ad, causing the user to mistakenly click the ad. (Are there more??)
- Want to create an automated system for detecting ad fraud
Challenges/overview for creating an automated system
- Challenge 1: scale. Need to handle many visually complex apps. A single page of an application can contain hundreds of clickable elements.
- Challenge 2: accurately and quickly identify fraud. A page may contain many ads, but only a subset is visible at a time. Need to determine whether any visible subset violates the ad provider’s rules on how many ads can be displayed at once. This is called the ‘sliding window’ problem (a rough sketch follows this list).
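As a rough illustration of the sliding-window check (the function name, the geometry model, and the ad limit below are my assumptions, not DECAF’s actual rules or API), here is a minimal sketch that slides a screen-sized window down a page and flags any position where more ads are fully visible than the ad network allows:

```python
def violates_ad_limit(ad_boxes, page_height, screen_height, max_visible_ads=3):
    """ad_boxes: list of (top, bottom) vertical extents of ads on the page.
    Slide a screen-sized window down the page and count fully visible ads."""
    for window_top in range(0, max(1, page_height - screen_height + 1)):
        window_bottom = window_top + screen_height
        visible = sum(1 for top, bottom in ad_boxes
                      if top >= window_top and bottom <= window_bottom)
        if visible > max_visible_ads:
            return True
    return False

# Four stacked ads that all fit on one 480px-tall screen together: violation.
ads = [(0, 50), (60, 110), (120, 170), (180, 230)]
print(violates_ad_limit(ads, page_height=800, screen_height=480))  # True
```

A real checker would presumably also account for horizontal position, partial visibility, and z-order (ads hidden behind other elements).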
Approach & tool (DECAF)
- DECAF includes an automated UI navigation layer, which feeds visible pages into a fraud checker that looks for different classes of ad fraud
- The UI navigator extracts the UI structure (what can be clicked, swiped, or scrolled, and what happens next; this forms a tree). The navigator performs each action and feeds the result to the checker. The search space can be huge.
- How to reduce the search space? DECAF uses the observation that structurally similar pages need not be revisited even if their content differs.
- To determine structural similarity, a feature vector is used that encodes the level and type of each structural element (a rough sketch of the idea follows this list). Is there a graph-isomorphism problem here?
- The fraud checker checks for “too many ads in one display screen,” the ‘sliding window’ problem (and perhaps more)
- Ran DECAF on many applications for 20 minutes to detect how prevalent different fraud types are. They also compare a basic UI navigator to one with the optimizations (skipping structurally similar pages, etc.)
- Found that a small number of developers is responsible for a lot of the ad fraud
- Can the UI navigator extract all types of actions? What about custom actions? Answer: it can’t extract all of them
- How will application developers circumvent DECAF? Answer: the checker can always be improved, so it’s a cat-and-mouse game. Questioner: more worried about the Monkey (the automated UI navigator); can developers hide ads from it better?
- Do you need to use structural similarity? Can you just use the page class? Answer: you have to define similarity some way. Answer: [I didn’t quite get this]
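Here is a rough sketch of the structural-similarity idea as I understood it: summarize a page as a histogram over (tree depth, element type) pairs so that two pages with the same layout but different content collapse to the same signature, letting the crawler skip one of them. The encoding below is my guess at the flavor of the feature vector, not DECAF’s exact one.

```python
from collections import Counter, namedtuple

# Minimal stand-in for a UI element; a real navigator would walk the
# platform's actual view tree.
Node = namedtuple("Node", ["kind", "children"])

def structure_vector(root):
    """Histogram over (depth, element kind) pairs, ignoring content."""
    counts = Counter()
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        counts[(depth, node.kind)] += 1
        stack.extend((child, depth + 1) for child in node.children)
    return counts

def structurally_similar(page_a, page_b):
    # Exact histogram match; a real system would likely use a distance
    # threshold rather than strict equality.
    return structure_vector(page_a) == structure_vector(page_b)

# Two article pages with different text but identical layout:
page1 = Node("panel", [Node("text", []), Node("image", []), Node("ad", [])])
page2 = Node("panel", [Node("text", []), Node("image", []), Node("ad", [])])
print(structurally_similar(page1, page2))  # True
```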
I know what your packet did last hop
- Presenter starts out with a few examples of network problems.
- States that existing tools (tcpdump, ping, traceroute) aren’t sufficient. They really don’t give much visibility into the network; the goal is to see information about every event that ever happened to every packet.
Approach & tool (NetSight)
- Obtain packet history (switches/routers a packet was routed through, forwarding state of those routers, etc.)
- NetSight adds rules to SDN switches that tell them to forward ‘postcards’ of packets to a special repository. A postcard includes the packet header, switch ID, output port, and ‘version of the flow table.’
- Need to reconstruct packet paths (packet histories) by analyzing postcard headers (a rough sketch follows this list).
- NetSight allows querying over packet histories. Some applications built on top of it: ndb (like gdb for networks), netwatch (live network invariant monitor), nprof (hierarchical network profiler), netshark (??)
- Each postcard is 64 bytes, and the Stanford network averages 5 hops with 1031-byte packets, so a naive implementation sends roughly 5 × 64 = 320 extra bytes per packet: about 31% network overhead
- Their answer for collecting data at scale is to use a map-reduce style architecture.
- HostAssist: the hypervisor inserts a unique ID into each packet at the source, which lets postcards omit the header (the hypervisor stores the header separately, keyed by the same ID). Reduces network overhead from 31% to 7%.
- If you use IDs to shrink postcards by omitting the header, how do you deal with malicious switches that strip the ID? Answer: this has to be taken care of somehow; for example, the proxy on the switches has to guarantee it.
- Question: How do you use this information? Answer: forwarding loops, reachability, etc. Follow-up: Are there standard queries? Answer: queries have the power of regular expressions.
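My rough sketch of how packet histories might be reconstructed from postcards: group postcards by a packet identifier (say, a hash of invariant header fields, or the HostAssist ID), then order each group by following the topology from the ingress switch. The field names and topology representation are illustrative assumptions, not NetSight’s actual format.

```python
from collections import defaultdict

def reconstruct_histories(postcards, topology, ingress_switches):
    """postcards: dicts with 'pkt_id', 'switch', 'out_port'.
    topology: maps (switch, out_port) -> next switch (None at the network edge)."""
    by_packet = defaultdict(dict)
    for pc in postcards:
        by_packet[pc["pkt_id"]][pc["switch"]] = pc

    histories = {}
    for pkt_id, cards in by_packet.items():
        # Start where the packet entered the network, then follow the links.
        switch = next(s for s in cards if s in ingress_switches)
        path = []
        while switch in cards:
            pc = cards.pop(switch)
            path.append(pc)
            switch = topology.get((pc["switch"], pc["out_port"]))
        histories[pkt_id] = path
    return histories

# Toy two-hop example: the packet traverses s1 then s2.
postcards = [
    {"pkt_id": 42, "switch": "s2", "out_port": 1},
    {"pkt_id": 42, "switch": "s1", "out_port": 3},
]
topology = {("s1", 3): "s2", ("s2", 1): None}
print(reconstruct_histories(postcards, topology, ingress_switches={"s1"}))
```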
Libra: Divide and Conquer to verify forwarding tables in huge networks
- Data-plane problems in data centers are common: missing forwarding rules, loops, and blackholes happen frequently at Google.
- Getting forwarding right is hard because of complex interactions between protocols and different states in different switches. Also, writers of forwarding rules usually don’t coordinate with each other.
- One approach to deal with this: static data-plane verification, which tells you whether loops or blackholes can occur. However, this doesn’t tell you why the incorrect rules were installed (that would require debugging the control plane).
- Data-plane verification is hard because rules change often (perhaps due to failures, etc.), so periodic snapshots of forwarding table rules may be inconsistent.
Approach to avoiding inconsistent forwarding table states
- Only capture forwarding-table state when the network is stable (no new forwarding rules are being sent or installed). The authors claim that the network is stable 99.9% of the time. (Where does this data come from? I think Google…)
How to scalably verify forwarding table state?
- 10,000 switches, 1000s of rules, lots of destinations…
- Forwarding graph: how the network routes packets to a given destination. Changing forwarding rules amounts to adding and deleting edges in this graph.
- Forwarding graphs help deal with scale: each graph contains only the subset of rules relevant to forwarding to a given destination, so the graphs can be verified independently
- Map-reduce-like framework for capturing forwarding state, matching rules to the per-destination forwarding graphs, and determining whether each graph satisfies the checks
- Libra is the framework that does this: it captures forwarding-table state and matches it to the relevant rules (a rough sketch of the idea follows this list)
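A hedged sketch of the divide-and-conquer idea as I understood it (illustrative only, not Libra’s implementation): shard the rules by destination prefix to build one forwarding graph per destination (the ‘map’ step), then independently check each graph for loops and blackholes (the ‘reduce’ step).

```python
from collections import defaultdict

def build_graphs(rules):
    """rules: list of (switch, dst_prefix, next_hop).
    Builds one forwarding graph per destination prefix (the 'map' step)."""
    graphs = defaultdict(dict)
    for switch, prefix, next_hop in rules:
        graphs[prefix][switch] = next_hop
    return graphs

def check_graph(graph, destination):
    """Walk from every switch toward the destination and report loops and
    blackholes (the 'reduce' step). Each destination's graph is independent,
    which is what lets the checks parallelize map-reduce style."""
    problems = []
    for start in graph:
        seen, node = set(), start
        while node != destination:
            if node in seen:
                problems.append(("loop", start))
                break
            seen.add(node)
            if node not in graph:
                problems.append(("blackhole", start))
                break
            node = graph[node]
    return problems

# Toy example: s2 forwards to s3, which has no rule for the prefix, so every
# path through it blackholes ("tor-10" is a made-up destination name).
rules = [("s1", "10.0.0.0/24", "s2"), ("s2", "10.0.0.0/24", "s3")]
graphs = build_graphs(rules)
print(check_graph(graphs["10.0.0.0/24"], destination="tor-10"))
```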