Video Emails Triple Click Through Rates

One of the most important metrics for successful email campaigns is the click-to-open rate. Getting messages into the inbox is important, but you still need to get the recipient to open the email and then take action based upon it. Useful images have always generated a higher click rate, but now studies are showing that using video triples the click-through rate of email campaigns.

To embed video, Implix recommended including a thumbnail in the email with a clear play button, which invites people to click over to the website to view the video.

Twitter Reaching Minority Internet Users

If you aren’t in an urban area, are slightly older, and white, you probably think that Twitter is the most over-hyped thing on the Internet. You may know some people who use it, but it seems silly and pointless. You use your phone to call and text people, and Facebook is still the leading social media engine, where you swap pictures of your kids with old classmates, etc.

But Twitter has had tremendous growth amongst the non-white population. According to Pew Research’s “Twitter Update 2011,” while only 9% of White Internet users use Twitter, 25% of Black Internet users and 19% of Hispanic Internet users use Twitter, and half of all Twitter users use Twitter on their phone.

If your product/service needs to reach minority Internet users or young Internet users, you need to have a Twitter strategy.

The Spanish Language Web

South and Central America are home to large and growing populations, sit in close proximity to the United States and Canada, and often have generous or free trade pacts with the United States, bringing business interaction as well as increased immigration. With a growing Spanish speaking population in America, you would think that there would be massive growth in the Spanish language web. According to Wikipedia, “Spanish Language in the United States,” 35.5 million Americans over the age of 5 speak Spanish as their primary language, roughly half of whom speak English “very well.”

So a little over 5% of the American population speaks Spanish and very little English, and a little over 5% of the American population speaks Spanish as its primary language while also speaking English very well.

When Apple was < 5% of the market, people worried about losing 5% market share by not accommodating them. By not offering your website in Spanish, you are losing 5% of the potential market that doesn't speak English well AND 5% of the potential market that speaks Spanish better. Because of this neglect, Spanish keywords are substantially less competitive in both paid and free search, yet they remain neglected.

What are you doing to accommodate this portion of America?

High Performance Writes with Triggers

September 7, 2010

A high performance system I was working on encountered a small technical problem: heavy and repeated load from a single data source caused the Apache threads to collide and time out, and the load balancer returned a 503 error.  After much frustration, the problem was finally tracked down to the following process.  When a data record enters the system, it is recorded and counters are incremented, and as it moves to the monetization systems, each transmission is logged.  Each transmission also has counters updated via trigger, which makes a tremendous amount of data available as simple lookups or small aggregates.  Knowing how many records came in from a company on 5 tracking codes over 3 months means an aggregate query that sums 15 rows (the monthly total for each tracking code), but it also means a tremendous number of UPDATE set count = count+1 triggers.  When a second event takes place before the first one is completed, a deadlock can occur as multiple transactions attempt to update the same row.

The solution?  A holding table without triggers.

Another table for transmissions was created that lacks any of the counter code that causes the deadlock.  When a transmission is logged, it is immediately recorded in this simple table, which returns control to the system.  A background daemon calls a single stored procedure to migrate the data:

  1. Record the id of all transmission records in this table (obviously, the database should handle this so changes in other transactions don’t interfere, but doing this explicitly makes it more portable and has a negligible impact on the process).
  2. As a single insert, bring all these records from the holding table to the permanent table.
  3. Delete these records from the holding table.
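
The three steps above can be sketched as follows.  This is a minimal illustration, not the original stored procedure: it swaps in SQLite (via Python's standard library) so it runs anywhere, and the table names transmission_holding, transmission_log, counters and batch_ids, as well as the trigger, are hypothetical stand-ins for the real schema.

```python
import sqlite3

# Hypothetical schema: a trigger-free holding table, a permanent table whose
# insert trigger maintains the counters, and a scratch table for batch ids.
SCHEMA = """
CREATE TABLE transmission_holding (
    id INTEGER PRIMARY KEY, company TEXT NOT NULL, tracking_code TEXT NOT NULL);
CREATE TABLE transmission_log (
    id INTEGER PRIMARY KEY, company TEXT NOT NULL, tracking_code TEXT NOT NULL);
CREATE TABLE counters (
    company TEXT NOT NULL, tracking_code TEXT NOT NULL, total INTEGER NOT NULL,
    PRIMARY KEY (company, tracking_code));
CREATE TRIGGER bump_counter AFTER INSERT ON transmission_log
BEGIN
    INSERT OR IGNORE INTO counters (company, tracking_code, total)
        VALUES (NEW.company, NEW.tracking_code, 0);
    UPDATE counters SET total = total + 1
        WHERE company = NEW.company AND tracking_code = NEW.tracking_code;
END;
CREATE TEMP TABLE batch_ids (id INTEGER PRIMARY KEY);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    conn.executescript(SCHEMA)


def migrate_holding_rows(conn: sqlite3.Connection) -> int:
    """Move the rows currently in the holding table to the permanent table."""
    with conn:  # commit on success, roll back if any statement fails
        # Step 1: record the ids of the rows that exist right now.
        conn.execute("DELETE FROM batch_ids")
        conn.execute("INSERT INTO batch_ids (id) SELECT id FROM transmission_holding")
        # Step 2: copy exactly those rows in a single insert (triggers fire here).
        conn.execute(
            "INSERT INTO transmission_log (id, company, tracking_code) "
            "SELECT id, company, tracking_code FROM transmission_holding "
            "WHERE id IN (SELECT id FROM batch_ids)"
        )
        # Step 3: delete exactly those rows from the holding table.
        conn.execute(
            "DELETE FROM transmission_holding WHERE id IN (SELECT id FROM batch_ids)"
        )
        moved = conn.execute("SELECT COUNT(*) FROM batch_ids").fetchone()[0]
    return moved
```

Because the permanent table's id is a primary key, copying the same holding row twice fails instead of double counting, which is the same safeguard the shared sequence provides in the real system.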

As a result of this process, the deadlock condition is resolved, and this procedure takes a fraction of the time because all the updates can be processed in the transaction without conflict.

The only drawback: the system’s counters and transmission log are only accurate as of the last time the procedure ran.  To make this as minimal an intrusion as possible, a daemon was written in bash (see this tutorial for a starting point) that simply runs the stored procedure, sleeps, and runs again.  To protect yourself from potential race conditions, which result in deadlocking, do three things:

  1. Use the same sequence for the primary key field in the temp table as the real one.  That way, if you attempt to copy a record a second time, it will fail because the ids are in use (this also lets me do bulk adds to the table from a data pull without wasting time in the temp table).
  2. Run the function as a dedicated user.  Have the function run with definer permissions, and create a dedicated user to run the script, capped at a single login.
  3. Run a daemon, not a cron job.  If something delays the script and cron launches it every minute, a second copy could start before the first has completed.  With a daemon, the second run doesn’t start until the first is done.
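
The run, sleep, run again loop from the third point can be sketched like this.  The original daemon was written in bash; this Python version is only an illustration, and run_daemon, max_runs and the injectable sleep argument are hypothetical names added so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional


def run_daemon(
    job: Callable[[], None],
    pause_seconds: float,
    max_runs: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call job, sleep, and repeat; the next run starts only after job returns."""
    runs = 0
    while max_runs is None or runs < max_runs:
        job()                 # blocks until the migration has finished
        runs += 1
        sleep(pause_seconds)  # the pause is measured from the end of the run
    return runs
```

Since the pause starts only after the job returns, a slow run delays the next one instead of overlapping with it, which is the property a once-a-minute cron entry cannot guarantee.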

When using a holding table, make certain that you are simply trying to let the triggers run after control has returned to the client function.  In a less high performance task, a simple LISTEN/NOTIFY structure instead of triggers will get the job done with less complexity.

Caveat: referential integrity can be compromised here if you are not careful.  If things are dependent on this table, this approach will not work, as the records will be recorded in the system but not yet passed to the table that the foreign keys reference.  Engineering around that limitation may create tremendous complexity.

Alternative approach to explore: partial replication to a second server.  In that scenario, all the control code (meta code) could replicate from the repository to the secondary server(s).  Each of those could hold temporary insert tables.  In that case, as you add servers, make the sequences on each server count by multiples (with two servers, one uses evens for ids and one uses odds).  Then you can replicate those tables back to the main database with the LISTEN/NOTIFY system, without worrying about multiple calls in short order.
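
The "count by multiples" idea can be illustrated with a small sketch.  In a real deployment this would be configured on each server's database sequence; the Python below only shows the arithmetic, and server_sequence and first_ids are hypothetical helpers.

```python
from itertools import count, islice
from typing import Iterator, List


def server_sequence(server_index: int, server_count: int) -> Iterator[int]:
    """Ids for one server: start at its own offset, step by the server count."""
    if not 0 <= server_index < server_count:
        raise ValueError("server_index must be in range(server_count)")
    return count(start=server_index + 1, step=server_count)


def first_ids(server_index: int, server_count: int, n: int) -> List[int]:
    """Convenience helper: the first n ids a given server would hand out."""
    return list(islice(server_sequence(server_index, server_count), n))
```

With two servers, first_ids(0, 2, 3) gives the odd ids 1, 3, 5 and first_ids(1, 2, 3) gives the even ids 2, 4, 6, so rows inserted on different servers never collide when they are merged back.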

Categories: Databases

Performance Tuning Websites

Download speed used to be one of the ways you could tell a real web pro from a graphic designer who knew how to make things pretty.  One of our exercises used to be building pages in a single table (this predated CSS), carefully spanning rows and columns to move content around.  It was a pain to develop but fast to render, since Netscape’s browser used to be terrible at “embedded tables.”  Obviously, this is archaic (along with worrying about 28.8k modem download speeds), but the concept of a page that is fast to download and fast to render never went away.  When Google officially announced that it would take page load into account, people finally started to pay attention.  One of the best “starting points” is Yahoo’s Performance Guide.

However, the process of making a website fast is pretty straightforward:

  1. Home Page and other entrances: VERY fast and simple
  2. Limit third party items that might cause delays via DNS or download
  3. Prevent things that can get VERY slow from being on these pages

One and two are the ones most often paid attention to, but #3 potentially has the biggest impact and is the most ignored.  For example, adding gzip to your server cuts file transmission size, which saves time and can take 80 ms loads down to 20 ms, but #3 is where page loads can move from 100 ms to 10-15 seconds.  If you query the database to build your navigation, the navigation is easier to manage, but one hiccup at the database level (one that locks that table) and your site hangs while loading.  A solution like Memcached moves your “read only” data out of the database and into RAM.  You can still manage it in the database, and the site will update relatively quickly, but there is no reason to consult the database multiple times for information that changes infrequently.

Third party servers often get ignored, but you have no control over when they have problems.  Serving Javascript from a third party has the advantage that you don’t have to maintain it, but puts you at risk of the user’s experience massively degrading.  Consider removing as many third party elements as possible.  Solutions like Google’s Javascript based tracking for Google Analytics have near-zero impact on page load (except for the transmission of the text and the download of the Javascript library).  Unlike images, they tend not to have performance problems, and the site will load even if the browser is having trouble obtaining the Javascript library at the bottom of your page.

Getting the 50% to 200% improvements is great, but a real focus on the few things that can explode out of control will serve you better in the long run.
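
As a rough illustration of keeping rarely changing data in RAM, here is a minimal in-process cache with an expiry time.  It is a stand-in for the idea, not the Memcached client API; TTLCache, get_or_load and the injectable clock are hypothetical names.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Keep loaded values in memory and reload them only after they expire."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: no database call
        value = loader()             # e.g. the navigation query
        self._entries[key] = (now + self._ttl, value)
        return value
```

A page handler would call something like cache.get_or_load("navigation", load_navigation_from_db), where load_navigation_from_db is whatever function runs the query today.  With a 60 second expiry the database is consulted at most once a minute for the navigation, and changes made in the database still show up within that window.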

Open Graph Brings SEO Opportunities to Facebook

So Facebook’s push for Open Graph integration, where the “Like Button” replaces the direct link, creates new opportunities for businesses to focus on Facebook’s search mechanism.  Some initial tests indicated that it is possible to now optimize for Facebook search, i.e. bringing SEO to Facebook.

Facebook Open Graph allows one to connect their site to Facebook without full integration, simply using new Facebook Meta Tags and the Like Button (a snippet of Javascript code).  Facebook is tracking these likes and building a “graph” of the Internet based upon the recommendations of your social network, and now they are including relevant results when you search Facebook for something.

This creates an opportunity for companies that are bringing their brand to Facebook to get additional exposure through Open Graph, which creates an incentive to use the technology.  This is exciting, as the move to the “walled garden” of social media threatened to disrupt the open world of search.  In the past, users could recommend a page on their blog, creating a “link graph” for the search engines to use, but now it’s easier to just click “share” and send the link out to your friends that way.  Without this part of the link graph, the search engines are missing out on the recommendations that they build their systems upon.

Open Graph brings out the ability to restore this, even if Facebook is the only company taking advantage of it for now.

Copyright in the Digital Age

In Code and Other Laws of Cyberspace, Lawrence Lessig was over 10 years ahead of his time in pointing out that code, as in software, is as important to the realities of the online legal regime as the laws passed by governing bodies.  There seems to be an increasing understanding that copyright, as we know it, is becoming obsolete.

Our notion of copyright, the exclusive right of an author/creator to control distribution, makes less and less sense as the technology evolves.  Copyright, at its core, protected the author from exploitation by the owners of the printing press.  Without copyright, the owner of the printing press would be able to create multiple copies of a book, article, etc., without compensation to the original author.

Consider, as a thought exercise, a novelist who brings a sample to a press owner, who agrees to share the revenues with the author.  Without copyright, that author would be able to collect from that press owner, but would have no protection from dozens of other press owners taking that work and making copies without compensation.  Copyright protected the author.  Our founding fathers established limited protection: 14 years for registered copyrights, with another 14 year renewal available.  With extensions and treaty obligations, Congress has extended the protection to around 100 years, give or take, depending upon whether it is published (70 years after the death of the author or, for corporate works, 95 years from publication or 120 years from creation, whichever expires first).  This has ensured that all works are protected seemingly forever.

However, in a digital realm, we are no longer worrying about the press owners, but everyone.  Everyone with a computer is capable of duplicating any work, so copyright attempts to regulate everyone.  In addition, the terms have been extended beyond anything reasonable, making the “public domain” trade-off merely theoretical.  For a television show released in 2010, it will be in the public domain in 2105, when nobody will have the ability to duplicate the product.  As culture speeds up, the lifespan of these works is measured in months or years, yet the copyright will last nearly 100 years.

In the computer space, we see the blatantly illegal Abandonware issue, where enthusiasts have archives of no longer available products available for download and possible emulation.  While one might question the literary importance of early computer games, they certainly played a role in American and global culture, and the copyright regime makes it likely that these works will never be available.  Publishers from the 1980s and 1990s are long gone, the copyright holders defunct or swallowed into larger companies, all with no interest in preserving the works of that time period.  For every game like Civilization with endless sequels (and presumably originals maintained and later republished as Flash games or equivalent), there are plenty of games that were exciting but the company went defunct, and changing architecture makes it impossible to maintain.

If I want to show my son the games of my youth, the laws of copyright may not apply (the disks/cartridges may be in a box at my parents’ house), but with no way to play them, the laws of code render them gone.  The copyright system simply has no way of preserving our digital past.  Websites go up and down, articles disappear or are archived, and the only record may be a print out that someone grabbed at the time, threw in a box, and has no legal right to republish.

The intersection of law and code is interesting, because the code permits saving the file and ANYONE republishing it, while the law prohibits anyone from doing so.  Alternatively, in the case of abandonware, the law permits me to own and play my purchased copy, but doesn’t permit any reasonable way of actually doing so without the work of those flouting the laws.

Napster may be long gone, but for over 10 years, nobody has assumed an obligation to pay for anything; people just choose to pay for convenience.  Copyright is increasingly a blunt instrument, simply at odds with how people publish and consume content.  Youtube lets anyone with an interest parody something, but leaves the enforcement of fair use to increasingly lawsuit-nervous companies, which simply take down anything that uses a few seconds of clips.  The meaning of copyright needs to be reconsidered when everyone can duplicate, the creation of content may be increasingly expensive, and our culture may simply be at the mercy of technology.

Decades of movies that will never be released in a digital format may exist in people’s VHS collections, but without a way to play them, they’ll simply be lost.  Culture is important, and who knows what future historians will be interested in when researching the culture of the 20th and 21st centuries.  Some of our early writing samples are of mundane things, simply because they survived, and it would be tragic if we simply litigated our creative history out of existence.  Current copyright is obsolete, and a new line needs to be drawn to preserve our culture and our rights.

Disney may not be interested in re-releasing Song of the South, but should they be allowed to keep it out of the nation’s cultural archive?
