fallenpegasus: amazon (Default)
One new "feature" of the next release of Apple iOS is the integration with Twitter. To get these features, Apple Inc and Twitter Inc had to "partner".

Twitter has been just as "integrated" with Android for quite some time. And all Twitter Inc had to do was write an app that used the standard Android API hooks.

In fact, my own Android phone doesn't even use the "official" Twitter app (which I find to be slow, heavy, and poorly done); it uses an even better and more advanced one called twicca. And the author of twicca didn't need to "partner" with either Twitter Inc or Google Inc. He just wrote a better app that, again, just used the public APIs.

It doesn't take a high-level corporate partnership, a 12-month product roadmap, and a heavyweight development cycle by employees of a set of large companies to add a deep and useful integrated feature to Android. It takes a single developer working in a cafe.
fallenpegasus: amazon (Default)
Last year, when I was doing MySQL Professional Services, I encountered a client that was already using memcached. One thing they said they were doing was caching the compiled bytecode of their PHP code in memcached, which was a big win because they ran a large fleet of identical PHP-based application servers. As soon as any one server encountered a new piece of PHP, it would compile and cache it, and immediately all the other app servers could use the same cached compiled bytecode, rather than repeat that work. They had recently switched to this approach from caching the compiled bytecode on the disk of each app server.
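
Roughly, the cycle they described would look something like this sketch in C against libmemcached (memcached was already in their stack). The compile_php_to_bytecode() function and the cache key scheme are made-up placeholders here, not their actual code; presumably the real key was a hash of the source file:

#include <libmemcached/memcached.h>
#include <stdint.h>

/* Hypothetical stand-in for the accelerator's real compiler entry point. */
extern char *compile_php_to_bytecode(const char *src, size_t src_len,
                                     size_t *bytecode_len);

/* Fetch compiled bytecode for a piece of PHP from memcached, compiling
   and storing it on a miss, so no other app server in the fleet ever
   has to repeat the work. */
char *get_bytecode(memcached_st *memc, const char *key, size_t key_len,
                   const char *src, size_t src_len, size_t *out_len)
{
  uint32_t flags;
  memcached_return_t rc;

  char *cached = memcached_get(memc, key, key_len, out_len, &flags, &rc);
  if (rc == MEMCACHED_SUCCESS)
    return cached;   /* some other server already compiled this */

  char *bytecode = compile_php_to_bytecode(src, src_len, out_len);
  memcached_set(memc, key, key_len, bytecode, *out_len,
                (time_t)0 /* never expire */, 0);
  return bytecode;
}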

I thought that was really neat, and kept digging elsewhere into their performance and scaling issues.

I had just assumed that this was some open source project, a modification or module to an existing PHP bytecode compiler / cacher / accelerator.

Except, it seems not to be. I've spent a couple of days now googling and reading up on the various "PHP accelerators", and they all appear to cache to disk or to local shared memory, but I can't find a reference anywhere to coupling one with memcached.

Am I just missing something? Is my google-fu failing me? Or is this something this shop wrote from scratch?

Do any of my readers know?
fallenpegasus: amazon (Default)
In Trond Norbye's blog entry Testing libmemcached on EC2, he refers to "Someone pinged me yesterday about a problem he was seeing when he tried to run the test suite on Jaunty Ubuntu."

I am that "someone".

I ended up filing two bugs on Launchpad against libmemcached, 456080 and 456084, and have just submitted a branch that fixes the second one.
fallenpegasus: amazon (Default)
In the wee hours of Sunday morning, while packing, I discovered that I didn't know where either my DL or my passport was. After spending an hour searching, I gave up and decided to throw myself on the mercy of the TSA, which I did at the airport. The TSA supervisor checked my credit cards, FlyClear card, corporate ID card, and Costco card, and then stamped my boarding pass. I only need to do this 3 more times on this trip, and then do a very deep search of my room when I get back.

After I landed at SJC, I discovered that Tim Lord had shared the flight with me. If I had known that ahead of time, I would have let him share my taxi ride from the Capitol Hill neighborhood to the SEA airport. As it was, I let him share my ride to the San Jose Convention Center. I was actually too early to check into my hotel room, so instead I checked my luggage and went off to find the Postgres Day.

Meeting up with open source community geeks, watching lightning talks, hacking, taking pictures.

I accidentally left my jacket in one of the meeting rooms, and ended up having to bid for it in an auction. So now I owe $20 to the Postgres Foundation. :)


I spent Monday hacking and geek socializing, hanging out in the speaker room. I had considered some of the tutorials, but I didn't end up going to any of them. Lunch was at the Good Karma cafe, which I had stumbled across the last time I was in San Jose.

Also on Monday, I hooked up MontyT with someone who may have a solution for automating the build of Drizzle on Windows machines without manually maintaining Visual Studio project files or porting Autotools to Windows.


Tuesday, I went to the Gearman tutorial, and while in it, started working on http://forge.gearman.org/ and also started implementing a bunch of basic "plumbing" Gearman workers that need to exist to link together the filesystem, MogileFS, CouchDB, Amazon Web Services, memcached, and Erlang.
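
All these plumbing workers have the same basic shape: connect to gearmand, register a function name, then loop taking jobs. A minimal worker skeleton in C against libgearman looks roughly like this; the "echo" function here is a placeholder for illustration, not one of the actual plumbing workers:

#include <libgearman/gearman.h>
#include <stdlib.h>
#include <string.h>

/* Trivial worker callback: hand the job's workload back as the result.
   A real plumbing worker would talk to the filesystem, memcached,
   CouchDB, S3, etc., instead. */
static void *echo_worker(gearman_job_st *job, void *context,
                         size_t *result_size, gearman_return_t *ret_ptr)
{
  (void) context;

  *result_size = gearman_job_workload_size(job);
  void *result = malloc(*result_size);
  memcpy(result, gearman_job_workload(job), *result_size);

  *ret_ptr = GEARMAN_SUCCESS;
  return result;   /* libgearman frees this after sending it */
}

int main(void)
{
  gearman_worker_st worker;
  gearman_worker_create(&worker);
  gearman_worker_add_server(&worker, "127.0.0.1", 4730);
  gearman_worker_add_function(&worker, "echo", 0, echo_worker, NULL);

  while (1)
    gearman_worker_work(&worker);   /* block, grab a job, run the callback */
}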

Right now, I'm sitting in the Ignite talks. BrianA just won a Google-O'Reilly Open Source Award.
fallenpegasus: amazon (Default)
I see three stages in the evolution of cloud computing adoption:

1. Taking an existing application, and transferring it as-is into a cloud provider. This is "forklifting".

2. Taking an application, and wrapping it up in cloud-based provisioning and backup. This is the business space that RightScale and Zmanda help with.

3. Developing applications that implicitly assume they are running in a cloud environment. I see a lot of talk of this, but not all that many such applications running yet. But very soon they are going to be everywhere.
fallenpegasus: amazon (Default)
So it turns out that the answer to the question "when computers and television converge, what will we have" is "computers". link.

So the next question is "when computers and telephones converge, what will we have?"

I think the answer is also going to be "computers". Just very portable ones.
fallenpegasus: amazon (Default)
I wish I could say
	printf("%*.16xs", n, p);


where n is a size_t and p is a void*, and it would output n bytes as hex.


And I want to be able to say
	printf("%*qs", n, p);


and it would output n bytes as a C-style backslash quoted string.


Yes, I can write functions that do all this, but I have to worry about buffers, allocate buffer space, and so forth. Plus it's a lot of noise and clutter for just outputting potentially unprintable strings, often just for logging and debugging.
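
For the record, the functions I keep rewriting look something like the sketch below. Writing straight to a FILE* at least dodges the buffer-sizing problem:

#include <stdio.h>

/* Print n bytes at p as two-digit hex, straight to a stream. */
void fput_hex(const void *p, size_t n, FILE *out)
{
  const unsigned char *bytes = p;
  for (size_t i = 0; i < n; i++)
    fprintf(out, "%02x", bytes[i]);
}

/* Print n bytes at p as a C-style backslash-quoted string. */
void fput_quoted(const void *p, size_t n, FILE *out)
{
  const unsigned char *bytes = p;
  fputc('"', out);
  for (size_t i = 0; i < n; i++) {
    if (bytes[i] == '"' || bytes[i] == '\\')
      fprintf(out, "\\%c", bytes[i]);
    else if (bytes[i] >= 0x20 && bytes[i] < 0x7f)
      fputc(bytes[i], out);
    else
      fprintf(out, "\\x%02x", bytes[i]);
  }
  fputc('"', out);
}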
fallenpegasus: amazon (Default)
A feature of the MySQL server that is used a lot, and yet is a source of much user confusion, code complexity, and multiprocessor lock contention, is logging: query logging, slow query logging, and the new 5.1 feature, "log to table".

I've removed almost all of that stuff from Drizzle (and removed two or three sets of no-longer-necessary mutex locks in the process), and replaced it with hooks into a logging plugin subsystem, and have implemented two plugins for it. One logs to a file, and the other logs to syslog.

The output looks almost completely unlike the current MySQL logging. There are no hash-prefixed pseudocomments, for one thing. And there is no distinction between the query log and the slow query log. Queries get logged, and the amount of time each query takes gets logged with it. This subsumes the "micro-slow patch" that is spreading around in the MySQL legacy world.

The current format is pretty "programmer-centric", but concrete advice and patches are welcome.


snprintf(msgbuf, MAX_MSG_LEN,
         "thread_id=%lu query_id=%lu"
         " t_connect=%llu t_start=%llu t_lock=%llu"
         " command=%.*s"
         " rows_sent=%lu rows_examined=%u\n"
         " db=\"%.*s\" query=\"%.*s\"\n",
         (unsigned long) session->thread_id,
         (unsigned long) session->query_id,
         (unsigned long long)(t_mark - session->connect_utime),
         (unsigned long long)(t_mark - session->start_utime),
         (unsigned long long)(t_mark - session->utime_after_lock),
         (uint32_t) command_name[session->command].length,
         command_name[session->command].str,
         (unsigned long) session->sent_row_count,
         (uint32_t) session->examined_row_count,
         session->db_length, session->db,
         session->query_length, session->query);



How to interpret this output, and how to turn on, control, and use logging, will be described in additional posts.
fallenpegasus: amazon (Default)
I've been involved with the Drizzle project since very soon after it began, working on it on nights and weekends.

That has just changed. As of today, I'm no longer a MySQL Professional Services consultant; instead, I'm part of a new division of Sun.

Much of my time is to be spent working on Drizzle, with a focus on plugin interfaces and making it work well in Extremely Large distributed environments.

I will be blogging heavily about what I am doing. How I sort that blogging out between my personal LiveJournal, my (mostly unused) Sun employee blog, and maybe some other blog system, remains TBD.

This is going to be fun.
fallenpegasus: amazon (Default)
A friend suggested the creation of a "Technical Professional Purity Test", and suggested two questions:

Have you ever implemented a solution you knew would doom the client?

Have you ever yelled at your manager for interrupting your workflow?


Please, suggest more!
fallenpegasus: amazon (Default)
So I have a string that has a binary data structure in it, and I want to use Perl's unpack() on it.

The binary structure is a 32-bit int, followed by N 8-byte chunks. And I don't have quadwords in this build of Perl, so I can't treat them as 64-bit ints.

And the following unpack string doesn't work: "L(a8)*".

What would the correct unpack string be?
fallenpegasus: amazon (Default)
Both Skype and Adium use Growl to announce connections by people on my contact list. The notification bubbles look exactly the same.

When you click on an Adium Growl announcement, the focus flips to an IM window with that user, so I can type at them immediately. But when you click on a Skype Growl announcement, it just flips over to Skype, and the focus is on whoever I Skype chatted with most recently, not the person announced by the Growl bubble.

This is very confusing.

The most common oops case is when I see my MySQL PS scheduler come online when I have a message for her. A couple of times, I've seen the Growl bubble, just instantly clicked it, typed my message, hit return, and sent a confusing message to someone else. Fortunately, there have been no major mishaps or client NDA breaches. But I do wish that Skype would fix that.

I also wish that both Adium and Skype would let me selectively enable Growl notifications on a group or individual basis. I like the Growl updates, but I don't need them for everyone on my contact list.
fallenpegasus: amazon (Default)
One of my gigs in the past few months was a company that owns a number of TV stations in a certain regional market. (No, not ClearChannel.)

My job was teaching them how to scale out a website that was very popular, and getting more so. It looks like several dozen sites, each with a different URL, different branding, different style sheet, and superficially different structure. But under the hood, they were all the same site, and about to become even more so.

What this site does is interesting. It's the morgue and archive of all their news stations, exposed to the public. Each station's local news video content, with transcripts and metadata, along with user-contributed videos, and user comments. The only real difference under the hood between the "official" content and the user content was the byline. Basically, they are running their own YouTube, and letting the viewers participate.

This working model is very shocking to this industry, and was hard to swallow even for some of the technical people I was meeting with.

All of this is interesting, but is only the leadup to what really impressed me.

They realized that "the competition" wasn't other TV stations. In fact, in some markets they directly and indirectly controlled both halves of the legally mandated duopoly. The competition was YouTube and local news blogs.

And they learned that their most popular video content was not the glossy edited news program segments. The most popular content was the raw footage, unedited, and often without sound. Second place was the user videos, and in a distant third was the edited "news program segments", filled full of edits and "journalistic" "context" voiced in by their "reporters".

Another interesting bit was that the site was starting to feed back to the stations. If something newsworthy happens, and some local bystander happens to catch it on their camera or phone, and uploads it to the site, they may use that video on the broadcast, instead of sending out a "professional" news crew.
fallenpegasus: amazon (Default)
Over at CNET, Matt Asay has posted an article, The open-source job shortage, talking about large enterprises' need for developers with deep MySQL experience.

While he is correct about the need for talent with that skillset, there are plenty of effective solutions.

A number of months ago, Harper Reed asked me where he could hire MySQL talent, and I told him to take his existing staff and run them through MySQL training. That seems to have worked for him. That's now my stock answer when people ask where they can hire MySQL talent.

When you need to go up to the next level, get and read the book High Performance MySQL, Second Edition. The book is basically several of the very best MySQL people in the world, reduced to readable book form. If your staff will read that book, they will become people with "deep MySQL experience".

If training up your own staff is not on the roadmap, and you need someone to come in for a week to analyze and design a new system, or to do performance fixes to an existing system, you have many choices. There is, of course, Sun MySQL Professional Services. Or you can go to folks like 42SQL, or Proven Scaling, or Percona, or Open Query.

Or say you want ongoing operational DBAs, or have a panic situation and need a DBA right now: there are outfits like Pythian and Blue Gecko. And if you are a hybrid shop, these two companies do both MySQL and Oracle.



In short, "using MySQL is risky because we can't find the talent!" is a solved problem.


Now, you might not want to pay the talent, but that's a different problem.


(Disclaimer and disclosure: I work for Sun MySQL Professional Services.)
fallenpegasus: amazon (Default)
Dell has obtained a trademark on the term "Cloud Computing".

http://tarr.uspto.gov/servlet/tarr?regser=serial&entry=77139082

People have been talking about computing in the cloud for years now, and network designers have been using a cloud icon to indicate "services on the internet out there somewhere" for over a decade.

This trademark needs to be killed; it was born generic.
fallenpegasus: amazon (Default)
Why may LIMIT only take a fixed literal number? Not even a user variable, let alone an expression.

Wouldn't it be useful to set a user session variable that always gets applied as an implicit limit value?

Both of these things could be done in Drizzle.
fallenpegasus: amazon (Default)
you should start discounting Subversion because of it’s questionable code history. The Microsoft product is clearly branded, and it’s code has a clear source that is fully accountable for it’s intellectual property. The Subversion code is a mishmash of code pasted in from all kinds of anonymous sources on the web. ... There is literally no reason to not use the Microsoft CodePlex product, and many reasons not to use the buggy and possiblly illegal Subversion/Sourceforge mashup. link


Literally no reason... Other than it is buggy, it is slow, it is beta, development is very slow, it locks you into a Microsoft-owned and dominated site, you can only get support from Microsoft, only for as long as Microsoft is interested in supporting it, you can only get bug fixes and new features on their schedule, and it only works with a paid product. Yeah, other than those reasons, literally no reason at all.
fallenpegasus: amazon (Default)
I've just checked in a small update to the s3-tools.

I've removed the dependency on the Perl package XML::Simple, and replaced it with calls to XML::LibXML, which will have already been loaded because Net::Amazon::S3 depends on it.

I really dislike doing what should be a minor update via CPAN, and having a cascading set of added dependencies cause CPAN to pull in the whole world. So I shall swim upstream, and remove unneeded dependencies.

Anyway,

The tarball can be had at http://fallenpegasus.com/code/s3-tools

The Mercurial repo is at http://hg.fallenpegasus.com/s3-tools
fallenpegasus: amazon (Default)
In a recent TechCrunch article about Amazon Web Services, it's revealed that "the biggest customers in both number and amount of computing resources consumed are divisions of banks, pharmaceuticals companies and other large corporations who try AWS once for a temporary project, and then get hooked."

This is not a big surprise to me. Last year sometime, at some random geeky event, I was explaining why I thought AWS was so cool, and one of the people I was explaining it to, who worked for a local insurance company, got very excited. Apparently, at this company, they would have huge stacks of compute servers that would lie fallow 28 days out of the month, but when it came time to run the monthly math to recalculate various risk/reward models, they would max out their capacity, and were constantly begging for more. They would very happily pay a premium to have those compute instances on-demand, by-the-drink, and not have to pay for them the rest of the month.

I wouldn't be surprised if they are now one of those AWS customers.
