Thursday, December 8, 2011

How To Convert All Text To Lowercase In Google Spreadsheets

We consume a lot of table-like data from clients, and it tends to contain a lot of human error. Often the data set is too big for manual fix-up to be fast enough (more than 100 entries) but not so big that a more elaborate solution is appropriate (millions of entries needing formulas and machine learning). It's just a static table of data that needs to get loaded into a Dictionary<string, string>.

When doing so we dump the data into Google Spreadsheets, perform some fixups, and then share it with the client to verify it's accurate. One annoyance is there's no way to just switch a column to lowercase directly, so here's one way to pull it off:

  1. Select the column, right-click, and click Insert Column Right.
  2. In the first cell of the new column enter the formula =LOWER(A1) (or whatever the first cell of the original column is).
  3. Copy the cell with the formula in it.
  4. Select the new column and paste - Google Spreadsheets is smart enough to adjust the row number for each pasted entry. You now have a column with lowercase in it - but it still depends on the original column to the left being there.
  5. Select the new column again and copy.
  6. Right-click and click Paste Special > Values only.
This wipes out the formulas. You can now delete the original column. Obviously this applies to anything you could do with a formula in spreadsheets.
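If the table is headed for a lookup structure anyway, the same fix-up can be done in code after exporting the sheet as CSV. This is only a minimal sketch in Python rather than the C# the Dictionary<string, string> above implies; the function name load_lowercased, the column names, and the sample data are all made up for illustration, and the whitespace trimming is an extra fix-up beyond what =LOWER() does.

```python
import csv
import io


def load_lowercased(csv_text, key_col, value_col):
    """Build a dict from two CSV columns, lowercasing the keys."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # .lower() mirrors =LOWER(A1); .strip() is an additional fix-up.
        key = row[key_col].strip().lower()
        lookup[key] = row[value_col].strip()
    return lookup


# Hypothetical client data with inconsistent casing and stray spaces.
sample = "Code,Name\nABC,Widget\n Def ,Gadget\n"
table = load_lowercased(sample, "Code", "Name")
print(table)
```

The spreadsheet route still has the advantage that the client can look over the cleaned-up column before it goes anywhere.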


Wednesday, November 9, 2011

NoSQL - Where's It Going? Where Should It Go?

The NoSQL movement is either the savior of web platforms or a major nuisance, depending on what kind of developer you happen to be. Either way, the way we store data is shifting. There are the old stand-by relational DBs like MySQL, Oracle and SQL Server, and then there are all the crazy new wave stores - BigTable, HBase, Cassandra, SimpleDB, etc. - all falling under the general category of "NoSQL."

Each of these has its own design features and focus, and they have less in common than an umbrella term like NoSQL should allow. Then again, since the category is defined by what its members are not, I suppose a ham sandwich could also be eligible.

In the simplest cases, all of these solutions allow you to get a basic job done: store rows of similar-ish data in a list. After that, things get crazy.

The first general category they differ by is performance. Some favor write performance. Some favor low-latency consistency. Some favor read performance. Some favor availability. It's unfortunate my choice of data store for my entire app impacts these factors. Really these are all strategies I'll need in varying amounts for different tasks my app performs.

The second general category they differ by is how you read and write data. Each generally has its own new-fangled API for accessing it. Some have libraries that let you pretend you're still using SQL, and tend to throw a lot of errors when you do anything interesting, let alone fancy. For example, Amazon's SimpleDB, while indeed simple, cannot handle relating data between 2 tables (which it calls domains). While it has a SQL-like interface that sits on top of its API, most of the SQL you're used to using will throw an error (like JOIN). What a nice cage to build an app inside of.

This failing of SimpleDB and some of the other NoSQL options seems to come from a bit of confusion about what NoSQL means. Although it generally means the store is not a relational database, that doesn't mean data never has relationships. It simply means there are no explicitly defined relationships in the database itself - really what people want is to never think about a Foreign Key again in their lives. Put another way, relationships are business logic that belongs in code, not the database. Fetching the data, and how that's performed, is still the responsibility of the data provider, not the app.

It seems there needs to be a general standard for NoSQL databases that's defined by what it is rather than what it is not, and here's what I see developers really looking for:

  • Named tables
  • Allows you to submit arbitrary data into tables ("schemaless")
  • Does not enforce data relationships (not a relational DB), but...
  • Allows you to join tables in queries
  • Allows sorting and filtering
  • Scalable
That's what people are really looking for in a NoSQL database. It eliminates the upfront cost of schemas, and eliminates a lot of the performance cost of storing all those rows in a scalable way. The one big burden hanging out there is handling joins - but it's still something that can be accomplished with scalability in mind, and it can be done at the data layer so a master in the data provider can service requests for any data of any shape.
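To make that wish list concrete, here is a toy in-memory sketch in Python. It isn't modeled on any real NoSQL product; the Store class, its method names, and the sample tables are all hypothetical. It shows named tables that accept arbitrary rows, no declared relationships, and a join worked out at query time inside the data layer.

```python
from collections import defaultdict


class Store:
    """Toy data layer: named tables holding arbitrary dict rows."""

    def __init__(self):
        self.tables = defaultdict(list)

    def insert(self, table, row):
        # Schemaless: any keys are accepted, nothing is validated.
        self.tables[table].append(dict(row))

    def join(self, left, right, left_key, right_key):
        """Inner join at query time; no relationship was declared up front."""
        # Index the right-hand table by its join key (a simple hash join).
        index = defaultdict(list)
        for r in self.tables[right]:
            if right_key in r:
                index[r[right_key]].append(r)
        results = []
        for l in self.tables[left]:
            if left_key not in l:
                continue  # rows without the key simply don't match
            for r in index.get(l[left_key], []):
                # On a column-name clash the right-hand value wins.
                results.append({**l, **r})
        return results


db = Store()
db.insert("users", {"user_id": 1, "name": "Ann"})
db.insert("users", {"user_id": 2, "name": "Bob", "nickname": "bobby"})
db.insert("orders", {"order_id": 10, "buyer": 1, "total": 25})
db.insert("orders", {"order_id": 11, "buyer": 3, "total": 40})

joined = db.join("orders", "users", "buyer", "user_id")
print(joined)
```

The app never tells the store that orders relate to users; it just asks for the join when it needs one. Doing that across many machines is the hard part, and this sketch doesn't attempt it.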

I put scalable last because the truth is that a lot of apps being built with NoSQL solutions are just hopeful. They have no need for something more scalable than a typical MySQL or even SQL Server Express instance can provide. But they do want to be done with schema management, and they want to design their app so it can handle the big time if it gets there.

There are some further features I'd ideally like to see, but they don't have to be there to fulfill the basics of what these modern DBs ought to be:
  • Ideally: Allows querying with SQL
  • Ideally: Handles indexing of multiple columns, preferably in response to queries
  • Ideally: Lets you specify the strategy for a specific table's storage ("the engine" by today's terms)
  • Ideally: Handles sharding etc strategies for you so you can store all data in a single named table, even if it's broken down into many smaller tablets under the hood
I don't have a great name for where these solutions all seem to be headed. Schemaless Joinable Tables?

Monday, October 10, 2011

Google's Dart

Google's Dart looks pretty cool.
http://www.dartlang.org/docs/getting-started/

It borrows (and improves) the only thing I like from PHP - shorthand for variables and expressions in strings:
'Hello $name' replaces $name with the value of the name variable.
'Answer: ${a + b}' performs the expression a + b and swaps that result in.

It also allows both var and typed variables in the same app, like JavaScript and C#, and borrows one of C#'s best features, lambda functions:

num circumference(num r) => 2 * 3.14 * r;

Finally, it doesn't have a heavy focus on making things private - nothing is private by default (you instead make something private by prefixing its name with an underscore). This is a somewhat odd syntax, but I think one of JavaScript's enduring and under-recognized strengths is that everything is public by default. This makes monkey-patching broken-but-useful libraries possible, something that's impossible in Java and has caused enormous amounts of pain in numerous past Java projects I've worked on.

Looking forward to seeing where Dart goes next. My suggestion: Port the Closure library to Dart.
http://code.google.com/closure/library/

Thursday, September 15, 2011

Surviving Google's Blogpocalypse

I attempted to log in to my Google Apps version of Gmail one day and was instead presented with a page I couldn't circumvent. Over 20 checkboxes, several tabs that didn't look like tabs, and a lot of confusing options. Eventually I was able to access Gmail again.

The next time I attempted to log in to Blogger, however, things went poorly. As it turns out, this transition does not have a migration path for Blogger/Blogspot, so you have to migrate manually.

Migrating manually is not obvious or easy. Here are the steps I had to take - maybe they'll help others:
  1. Choose the option to create a new personal Gmail account for your blog. You'll need to go through the usual Gmail signup process where you create another username, another password to remember, and have to enter another arbitrary security question.
  2. Log in with the new Gmail account. You can sign out of your existing account first, or do this in an Incognito window or another browser to skip the sign-out step.
  3. Now you have access to your blog again! ...but you don't want to have to use this other random Gmail account to edit it every time.
  4. Get to Settings > Permissions
    1. In the old look: Under the blog name, click Settings, then the Permissions tab.
    2. In the new look: Click the blog name, click Settings, and look under Permissions.
  5. Click Add Authors, and add your old Apps account. You can't just add it normally, though: the operation quietly fails with no open invite, no indication the invite went out, no email, and no indication of an error. Instead, you need to use an alias for your account. If you have multiple domains associated with your account this is simple - use one of the alias domains. If you don't, you'll need to create an Alias for your user in the Domain Admin, then invite that Alias.
  6. Check your email for the invite, and accept.
  7. Come back to the incognito window (or, sadly, log out and log in as the ephemeral Gmail account you created). Get to Settings > Permissions again, and now change your invited self to Admin instead of Author.
  8. You can finally return to using Blogger the way you always did before Google ruined your day.
I'd like to point out that this transition really should have been entirely under the covers - as a user, I log in with my Google Apps email address to edit my blog, and I want to keep doing so. That Google is going through a major systems changeover shouldn't require me to go through all of this trouble. If the transition is going to be mandatory, it should have waited until all products could be migrated automatically. Instead a lot of non-technical users had to deal with this insane process with no support on offer.

Google is notoriously poor at customer support. I actually recall standing at TGIF listening to a Googler ask the founders why customers are pushed to post their problems to Support forums no one reads, getting no response unless an employee happens to take it upon themselves to look into it, or it makes it onto the front page of Slashdot. Sergey Brin's actual response was, "Well we shouldn't resolve these issues by having a big customer service department. We should resolve them by writing better code." I really should've grabbed a mic and described a metaphorical situation in which a farmer starts closing the barn door after his cow wanders off, but alas.

This is the support thread for this problem. It's safe to assume no Google employee will ever read it, let alone respond to it: http://www.google.com/support/forum/p/blogger/thread?tid=239869e385664e6b&hl=en&fid=239869e385664e6b0004acf9193ad5a4

Tuesday, August 30, 2011

Kintera.org/Blackbaud.com infecting its users - on its donation page

I recently tried to donate money to a friend's charity. The donation page is hosted on Kintera.org and includes a form to collect credit card info, plus a Java applet that shows who else has donated recently. The applet uses a scrolling library they probably pulled off some untrustworthy website (I doubt it's the worse possibility - Kintera willfully infecting those making donations).

Unfortunately that scrolling library has 3 viruses, all of which act as Trojans to infect the user's machine and place them at the whim of a command and control bot network:

Java CVE-2008-5353.KM
Java CVE-2009-3867.GC
Java CVE-2008-3869.M

That's pretty embarrassing. The scrolling applet actually appears one page before you fill out your credit card info, so in the absolute worst case scenario, you view the page, click Continue while the infection is occurring, a keylogger downloads and runs, you enter your credit card info, and off it goes to as many as 3 bot network owners/users. Not cool.


Confidence indeed.

Monday, August 29, 2011

How to Root the HTC Evo Shift 4G

Sprint blocks its forums from being viewed by users who aren't logged in; this same information is posted at:
http://community.sprint.com/baw/message/329584

But you probably can't view it. Here it is reposted: How to root the HTC Evo Shift 4G.

You need the JDK installed:
http://www.oracle.com/technetwork/java/javase/downloads/java-se-jdk-7-download-432154.html

The Android SDK installed:
http://developer.android.com/sdk/index.html

And the HTC Sync software installed:
http://www.htc.com/www/help/ (scroll down to HTC Sync for all HTC Android phones and click Download)

Now follow these instructions:
http://forum.xda-developers.com/showthread.php?t=1185243

You'll need to cd into the directory where the Android SDK was installed, and then into the platform-tools directory inside that, in order to run adb and perform the other commands they ask you to run. You also need to move the 3 files they tell you to download into platform-tools (or, reference the path you downloaded them to in the commands you run - adb push).

This works on the current version as of this posting Aug 27, 2011: Android 2.3.3, but is unlikely to work in a future OTA update if there is one. Note that this only gives you temporary root but that's all you need to wipe out built-in apps you don't want. Note also that other temp root solutions like Visionary and permanent root solutions like ShiftRR will not work. Only the method linked to above will work on this latest OTA.

You can easily delete built-in apps while rooted by installing ES File Explorer from the Market (it's free), then go into Menu>Settings and check Root Explorer, then check Mount File System. Then browse to /system/app (you may need to change Home Directory to / instead of /sdcard to get to it). Press and hold on built-in apps you don't want, then tap Delete.

I deleted Amazon MP3, Nascar, NFL ("sfl-prod-release.apk"), Sprint Navigator, Sprint TV, and Swype (so I could install the latest). I doubt it's smart to get rid of the annoying Sprint Zone app because it appears to be how PRL updates etc get onto the phone.

You can prevent future OTA updates from putting all these apps back on by tapping Menu>Settings>Software Updates>HTC software update and unchecking Scheduled check. You can always explicitly ask for an OTA update by coming back to this screen and tapping Check now.

Thursday, August 11, 2011

Stop Enforcement of Patents Without a Publicly Available Product


http://mobileopportunity.blogspot.com/2011/08/case-for-software-patents.html

He takes a long time to get to it, but I 100% agree:

restrict the right of "non-practicing entities" (patent trolls) to sue for patent infringement.

That's exactly what we need. Unfortunately he spends most of his time rehashing an old debate, briefly mentions this with no ideas on how to implement it (a tough problem), and moves on.

I think you could lay down some pretty simple rules. First, you could state that a patent cannot be enforced in court if what it protects is not available to the public either through your company or through a company that has licensed it. What this would lead to is a big company potentially stealing your idea while you develop it - but you can always finish the race to get it to market THEN sue for past damages. I think this is an acceptable outcome. It would prevent patent trolls from suing because they obviously have no intention of introducing a competing product, and the cost of doing so would be too high.

It would leave the licensing option open to some abuse though, and "available to the public" needs a tighter definition as well. But hey - it's a start. More than this guy tried.

He also leaves out one last negative impact of patents: they completely disclose to the world the details of what makes your product special. They protect you against competition only within the country (and even then, probably only from small players - big companies have a long history of kicking over the little guy, patents and all). I question whether patents remain valuable for small innovators (which should be the goal) when they have to fully disclose what they're patenting. It seems like you should be able to file a patent and get it approved, but not have it go public until you give the say-so (basically when the product is released). There's no point in having the patent until then anyway (because you can't sue until the product is available to the public), and making it known beforehand is dangerous - Chinese manufacturers love to just steal designs wholesale and give US companies the finger.

That's the final piece that's missing - worldwide protection after disclosure. That's really an enforcement problem. I suppose that's up to the PTO and the US as a whole to enforce - but only after we get our own **** together.