Harðkjarni: thoughts


19 August 2014

My ongoing list

My totally unoriginal list of things I'm still learning:

It is easier to ask for forgiveness than permission. Just do it!
It's better to fix problems than to try to prevent every error. So try to fail as fast and as early as you possibly can, and learn from it.
It's not your baby, so leave your ego at the door. Your work is not "yours" but your employer's. So as soon as it's been typed, it can and should be criticized and worked on by anyone.
Everyone does things slightly differently. If things are correct, avoid judging other people's work purely on style or on differences from how "you would do it".

26 May 2014

The Extra Effort

“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.”

– Antoine de Saint-Exupéry

An interesting study was recently published in Social Psychological and Personality Science: "Worth Keeping but Not Exceeding: Asymmetric Consequences of Breaking Versus Exceeding Promises".

It describes research into how promises and contracts are viewed by people, especially how their completion is perceived and how people are likely to react to under- and over-delivering on promises as compared to delivering on exactly what was agreed upon.

"Businesses may work hard to exceed their promises to customers or employees, but our research suggests that this hard work may not produce the desired consequences beyond those obtained by simply keeping promises. [...] The results of our experiments suggest that it is wise to invest effort in keeping a promise because breaking it can be costly, but it may be unwise to invest additional effort to exceed one’s promises. When companies, friends, or coworkers put forth the effort to keep a promise, their effort is likely to be rewarded. But when they expend extra effort in order to exceed those promises, their effort appears likely to be overlooked."

This somewhat confirms what I have experienced during my career when it comes to delivering software. The single most important thing is to deliver what was originally agreed upon, in a working state. A solution with a minimal set of features, each doing exactly what was agreed upon, is in most cases much better received than a solution with a bunch of bells and whistles, even when both deliver equally reliable software on the exact same promise.

Pro-Tip

Deliver only what was agreed on and at the right time. Nothing more, nothing less. Never mention any bonus features but consider saving any that may have been implemented as a free surprise a few weeks after a successful delivery on the initial contract.

Publication information:

Worth Keeping but Not Exceeding: Asymmetric Consequences of Breaking Versus Exceeding Promises
Ayelet Gneezy and Nicholas Epley
Social Psychological and Personality Science published online 8 May 2014
DOI: 10.1177/1948550614533134

Abstract
Promises are social contracts that can be broken, kept, or exceeded. Breaking one’s promise is evaluated more negatively than keeping one’s promise. Does expending more effort to exceed a promise lead to equivalently more positive evaluations? Although linear in their outcomes, we expected an asymmetry in evaluations of broken, kept, and exceeded promises. Whereas breaking one’s promise is obviously negative compared to keeping a promise, we predicted that exceeding one’s promise would not be evaluated more positively than merely keeping a promise. Three sets of experiments involving hypothetical, recalled, and actual promises support these predictions. A final experiment suggests this asymmetry comes from overvaluing kept promises rather than undervaluing exceeded promises. We suggest this pattern may reflect a general tendency in social systems to discourage selfishness and reward cooperation. Breaking one’s promise is costly, but exceeding it does not appear worth the effort.

20 February 2014

Failure resiliency and retry delay planning

When dealing with networked or external data sources I've learned the hard way that all code should be designed to expect failures. It is significantly cheaper and easier to bake in graceful error handling from the start than to attempt to add it later on. A common first step in combating failures and providing resiliency is to have your logic retry an operation when it fails.

But how should you retry?

Simple retries

The most simplistic retry mode is to surround your code with a loop that executes the block up to a predefined number of times, à la:

int retry = 0;
do
{
   // Operation: stop as soon as it succeeds
   if (MyFlakyOperation())
       break;
}
while (++retry < 6);

The problem with this approach is that it completely ignores the most likely underlying reason for the failure. Congestion or resource load on the remote end could be causing your calls (and many others) to intermittently fail as the server cannot handle the incoming requests. In this case your naive implementation might actually be contributing to making the situation even worse.

So how do we solve this?

Spacing out retries

One common approach to spacing out retries is called exponential backoff. This algorithm uses some form of feedback (e.g. the retry count) to systematically increase the wait time between repeated executions of the same code, in order to avoid congestion.

Figure: Example of exponential spacing based on a 4 sec base wait time. The vertical bars indicate retry points.

The idea is that with every additional retry that is required it is more likely that the system we're communicating with is heavily congested and needs more "breathing" space to resolve its current backlog of requests.

Example of backoff retries

Below is an example of a very simple C++ algorithm snippet that performs this kind of exponential backoff based on 4sec intervals:

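// Requires <chrono>, <cmath> and <thread>
const int maxAttempts = 6;
int success = OK;
int retry = 0;
do
{
   // Operation
   success = MyFlakyOperation();

   // Sleep only if the operation failed and another attempt remains
   if (success != OK && retry + 1 < maxAttempts)
   {
       // Waits of 4^1, 4^2, ... seconds: 4, 16, 64, 256, 1024
       int sec = static_cast<int>(std::pow(4, retry + 1));
       std::this_thread::sleep_for(std::chrono::seconds(sec));
   }
}
while (++retry < maxAttempts && success != OK);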

In this example my algorithm has a maximum waiting time, when the full retry count is used, of a whopping 22 min and 44 seconds! (4+16+64+256+1024 = 1364 sec).
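The sum above is easy to double-check in code. This is only a sketch: the helper name totalWaitSeconds and its parameters are made up for illustration, and it assumes that the wait before retry k is base^k seconds, as in the sum above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the snippet above): adds up the waits
// base^1 + base^2 + ... + base^waits, in seconds.
std::uint64_t totalWaitSeconds(std::uint64_t base, int waits)
{
    assert(waits >= 0);

    std::uint64_t total = 0;
    std::uint64_t delay = 1;
    for (int k = 1; k <= waits; ++k)
    {
        delay *= base;   // delay is now base^k
        total += delay;  // running sum of all waits so far
    }
    return total;
}
```

Six attempts leave five waits in between, so totalWaitSeconds(4, 5) gives 1364 seconds, which is the 22 min and 44 sec mentioned above.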

How much does the waiting time increase?

Care must be taken when choosing the base interval for a naive approach like my example above. Below is a table listing the waiting time in seconds before each retry, for base intervals of 2 to 7 seconds.

Remember that your maximum running time is the cumulative sum of the waiting times for all retries!

Retry#	2sec	3sec	4sec	5sec	6sec	7sec
1	2	3	4	5	6	7
2	4	9	16	25	36	49
3	8	27	64	125	216	343
4	16	81	256	625	1,296	2,401
5	32	243	1,024	3,125	7,776	16,807
6	64	729	4,096	15,625	46,656	117,649
7	128	2,187	16,384	78,125	279,936	823,543
8	256	6,561	65,536	390,625	1,679,616	5,764,801
9	512	19,683	262,144	1,953,125	10,077,696	40,353,607
10	1,024	59,049	1,048,576	9,765,625	60,466,176	282,475,249

So using 7 sec as a base and allowing up to 10 retries, the total maximum waiting time will be just shy of 10.5 years!
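One common way to keep these numbers from running away is to cap each wait at a fixed ceiling. The cap is not part of my example above. The sketch below, with the made-up names cappedDelay and maxDelay, only illustrates the idea, and a sensible ceiling depends entirely on the service you are calling.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: the wait before retry number `retry` (1-based).
// It grows as base^retry seconds but never exceeds maxDelay.
std::chrono::seconds cappedDelay(int base, int retry, std::chrono::seconds maxDelay)
{
    assert(base >= 1 && retry >= 1 && maxDelay.count() >= 0);

    const long long cap = maxDelay.count();
    long long delay = 1;
    for (int k = 0; k < retry; ++k)
    {
        // If the next multiplication would go past the cap, stop here.
        // Checking before multiplying also keeps delay from overflowing.
        if (delay > cap / base)
            return maxDelay;
        delay *= base;
    }
    return std::chrono::seconds(delay);
}
```

With a five-minute ceiling, cappedDelay(7, 10, std::chrono::seconds(300)) returns 300 seconds instead of the 282,475,249 seconds from the last row of the table, while the first two retries still wait only 7 and 49 seconds.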

13 April 2013

Is my mobile app really free when it is serving ads?

I use this rather handy app, BaconReader from http://onelouder.com/ to browse Reddit.com.

I connected my tablet to my computer and turned on LogCat as I was about to start debugging my own little app. Instead two little log lines caught my eye:

D/baconreader(25507): Message Service Started
D/baconreader(25507): Checking for messages

That was something I thought shouldn't be running at all. I launched the BaconReader app to double-check that all settings related to notifications and background activity for that app were indeed turned off or set to manual (I do this in an effort to conserve battery power). They were all off, but the app was still periodically starting up a service to check for messages (thanks for that).

While doing this I glimpsed how much excess network activity the application was performing while I was under the impression that it was "idle" (i.e. I wasn't interacting with it).

Usage and User Profiling

The application is using what seem to be two different platforms to track users and usage patterns: Flurry and Google Analytics. Fair enough, nothing too fishy about that I guess. Both services seemed to communicate very sparingly with their servers, sending only an initial message when the application started up and then staying silent while I let the app sit "idle".

I have no problems with apps that serve ads in exchange for me being able to use them for free. What I found curious was that this app seems to be contacting three different ad services with varying levels of detail about both me and my device.

Millennial Media
This component initiated an HTTP request twice every second, each time with a very verbose URL. Below is an example of the query parameters for one request:

accelerometer:true
adtype:MMBannerAdTop
ar:manual
bl:46
cachedvideo:false
conn:wifi
country:GB
density:2.0
dm:Nexus 10
dv:Android4.2.2
hdid:mmh_94888091071502DC8F18CE0A08CBFA79_6AC9AD8BDA218890306A1DAF963D603E089F82C9
height:53
hpx:2464
hsht:53
hswd:320
language:en
loc:false
mmdid:mmh_94888091071502DC8F18CE0A08CBFA79_6AC9AD8BDA218890306A1DAF963D603E089F82C9
mmisdk:4.6.0-12.07.16.a
pkid:com.onelouder.baconreader
pknm:BaconReader
plugged:true
reqtype:getad
sdkapid:62626
space:16810389504
ua:Mozilla/5.0 (Linux; U; Android 4.2.2; en-gb; Nexus 10 Build/JDQ39) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Safari/534.30Nexus 10
video:true
width:320
wpx:1

I'd love to know why they need to know whether my device is plugged in, how much space is free on it and what connection method I'm using to connect to the internet. It seems a little much, but still nothing surprising or unexpected from ad requests. The standard tracking cookie for my device is even included in the hdid variable (I assume), but no GPS information was transmitted (even though GPS was enabled for apps on my tablet).

Google AdMob

Pretty standard AdMob (old doubleclick.net) requests really. Nothing unexpected here; they even try to minimise the number of characters travelling over the network. How nice of them! Still, they sent two requests every second. Below is an example of their query string:

preqs:0
session_id:6944941379333807684
u_sd:2
seq_num:1
u_w:800
msid:com.onelouder.baconreader
js:afma-sdk-a-v6.2.1
mv:8016014.com.android.vending
isu:94888091071502DC8F18CE0A08CBFA79
cipa:1
bas_off:0
format:320x50_mb
oar:0
net:wi
app_name:38.android.com.onelouder.baconreader
hl:en
gnt:0
u_h:1232
carrier:
bas_on:0
ptime:0
u_audio:1
u_so:p
output:html
region:mobile_app
u_tz:60
client_sdk:1
ex:1
slotname:a14eb80d44ba957
caps:inlineVideo_interactiveVideo_mraid1_th_autoplay_mediation_sdkAdmobApiForAds_di
jsv:46

Ad Marvel

This was by far the worst offender. AdMarvel was a small mobile ad startup that was recently (in 2010) bought by Opera Software. The device performed 4 requests every second (two of them were identical except for one query string variable: retrynum had been incremented from 0 to 1). Below is an example of one of their query strings:

site_id:23206
partner_id:7b862efbe6c75952
timeout:5000
version:1.5
language:java
format:android
sdk_version:2.3.7.1
sdk_version_date:2013-02-04
sdk_supported:_admob_millennial_amazon
device_model:Nexus 10
device_name:JDQ39
device_systemversion:4.2.2
retrynum:0
excluded_banners:
device_orientation:portrait
device_connectivity:wifi
resolution_width:1600
max_image_width:1600
resolution_height:2464
max_image_height:2464
device_density:2.0
device_os:Android
adtype:banner
device_details:brand:google,model:Nexus 10,width:1600,height:2464,os:4.2.2,ua:Mozilla/5.0 (Linux; U; Android 4.2.2; en-gb; Nexus 10 Build/JDQ39) AppleWebKit/525.10+ (KHTML, like Gecko) Version/3.0.4 Mobile Safari/523.12.2
hardware_accelerated:true
target_params:UNIQUE_ID=>f780070d9537897a||GEOLOCATION=>59.4954954894955%2C-0.99682319125740403||bucket=>4||subreditname=>Front+Page||RTBID=>FBATTRID%3Ae939e527-35db-4e98-8d73-e05fead999b1||APP_VERSION=>2.8.1||RESPONSE_TYPE=>xml_with_xhtml||BNG=>0

What I find worrying here (besides the size of the request) is their final target_params variable. It includes not only the ID for my device and exactly which part of the application I am viewing, but also a pretty accurate GPS location (disabling GPS access for all apps got rid of that) and what looks to be a Facebook-related FBATTRID tag to serve ads from Facebook's MoPub Marketplace.

Talk about being tracked between devices.

Conclusion

All of this activity was happening multiple times a second for all three advertising services on my device. So at least six times a second (eight if AdMarvel's retry requests are counted) my device was communicating tracking information and requesting advertisements back.

One of the most battery draining activities is transmitting data through the air on a mobile device. Based on these three services it seems like the industry standard is to query twice every second which to me feels overly aggressive and costly on my battery charge.

So what are we really getting for free? How much money are we spending on electricity to power our devices that is then used directly to serve us advertisements?

Perhaps I should just buy the thing, probably cheaper in the long run!