mamamusings: October 26, 2005

elizabeth lane lawley's thoughts on technology, academia, family, and tangential topics

Wednesday, 26 October 2005

my public apology to adam smith of google

So, I owe Adam Smith an apology. I was awfully snarky in my blog post last night, and somewhat unfair in my characterization. He was gracious enough to stop by to say hello this morning, after having read my post, and I apologized to him then. But if I’m going to ding him publicly on my site, I feel as though I should apologize publicly, as well.

First of all, as many people pointed out to me this morning, he’s most definitely not over 40 (while I cannot authoritatively confirm his gender, I’m still fairly confident that he’s male…).

Second, as someone representing his company, he’s under significant constraints in terms of what he can say. When I went through employee orientation at Microsoft, I was warned many times about how quickly people would distort what I said or wrote simply because of my affiliation with the company. I was skeptical, but since then I’ve seen first-hand how that does indeed happen, and I can’t fault Adam for being cautious in his responses, and sticking close to the party line.

Finally, I have to give him (and Google) huge props for being here, and engaging in the dialogue. He’s weathered a lot of criticism gracefully, and that’s not easy to do even when you don’t have hundreds of people watching you.

Posted at 10:16 AM | Permalink | Comments (2) | TrackBack (0)
more like this: conferences | unclassifiable

internet librarian 05: google debate

Rich Wiggins squares off against Roy Tennant in a debate over “Google: Catalyst for Digitization or Library Destruction?”

Rich starts off, and is utterly charming. Some funny starting slides, hard to capture in print because of their visual impact.

Starts by talking about a similar debate they had 4 years ago. (The slides are dense with bullet points now, and I’m sitting where it’s hard for me to see the screen, so I’m not going to try to transcribe them. Later I’ll look for a pointer to the presentation online.)

How many bytes are in the Library of Congress? This is a non-trivial question, with lots of technical aspects. You can’t gloss over those aspects (resolution, color, etc.) because you’ll end up wasting effort. Rich cites Brewster Kahle’s estimate of 20 terabytes.

Rich says it’s becoming so inexpensive to capture full-text and images that complete digitization is becoming realistic. Disk space is cheap, and scanning technology has improved. He asked Google what they’re using, and they wouldn’t answer. (Color me shocked…) I wonder whether Microsoft will be more forthcoming, considering their partnership with OCA. I hope so. [add musing on Google’s secrecy here]

Refers to the comment last night by Stephen Abrams that we spend more money getting a book through ILL than we do to buy it. (That’s a really interesting thing to think about.)

There are a bunch of straw man arguments here. He dismisses the preservation argument—we have better access, since you can still get the stuff online after a fire. (But what happens when the power goes out? That happens a lot more often…) Doesn’t address the question of what happens when data is stored in proprietary formats—do we know what format Google will store this information in?

His bottom line, “Google Print has taught us to ‘think big.’” (hmmm. does the period go before or between the single and double quotes there?)

Argues that this vision of digitization will have to be carried out by a forward-thinking company, not by government. (He claims that Google invented Ajax!!!!) Mocks Microsoft, saying they’re playing catch-up, and not very well. “Hmmm…Google’s going to digitize millions of books? We’ll digitize 150,000!”

Now it’s Roy’s turn. Starts out by saying that his bottom line is “more access is better.” He thinks it’s great that Google’s digitizing stuff, that OCA is doing it, that libraries have been doing it for decades. There’s a lot of room for everyone to be involved. Says he’s going to try to be provocative, and starts out with a Halloween-themed slide that reads “Google: Devil? or Merely Evil?” (I didn’t get a photo of this, but would love to get the slide from him.) Says he’s going to talk about the scary monsters that he sees lurking in this project.

The first monster: the fair use problem. He’s concerned about Google trying to shield themselves with fair use. Because this has pulled the issue into the courts, it has the potential to result in restriction of fair use rights for everyone, including libraries.

The second monster: Closed access to open material. For example, there are many copies of Call of the Wild that are freely available. But when you go to Google Print, you won’t know that—you’ll see the reprinted, proprietary version from a publisher, without an indication that it’s in the public domain and can be found from other sources. “And to add insult to injury, they give you links to buy the book, but no links to libraries.” He’s been assured this will change, but it hasn’t happened yet, and there’s no guarantee that it will.

The third monster: Blind, wholesale digitization. He’s not so sure this is a good thing. Large collections in research libraries are choked with out-of-date crap, kept so that their collection numbers are high enough to keep them in their “tier.” Also, because copyrighted information is more difficult to get to, people will rely on old, out-of-date information because it’s free and easy to get to. Is this a good thing? (This is a great point that I haven’t heard mentioned before.) OCA is more focused on selective digitization—for example, American literature.

The fourth monster: advertising. How long before we see ads for antidepressant medication next to Hamlet? Google’s window of opportunity to do “good things” will be constricted by their responsibility to stockholders.

The fifth monster: secrecy. The agreements between Google and libraries have been largely kept secret. Before the announcement, the Google libraries could not even talk to each other. Michigan revealed theirs (but not until a Freedom of Information Act request forced it, and months after the project was announced). Rumor has it that UM has the best agreement from the library perspective, and that other libraries are agreeing to much less favorable terms. This is a hot button for me. One of the things that I really like about Microsoft is the extent to which its researchers regularly collaborate, publish, and present outside of the company. If Google’s intent is purely philanthropic, why does the commitment to “provide access to the world’s information” stop at their front door?

The sixth monster: longevity.

Now Adam Smith gets a chance to respond. Flashes a charming grin, and says “I’m not that dangerous, am I?” :) (This is what scares me most about Google. Their people and their products are indeed so seductively charming, it’s easy to take their claims of purely philanthropic motivation seriously.)

He encourages feedback and criticism—says that’s how they make their products better. They launch things quickly so they can get feedback quickly. They walk a difficult path in trying to make many parties happy. Their goal is to make information more accessible, not hidden in library stacks. Says he’ll be here to answer questions.

He’s asked about the scanning process—they’ve developed a proprietary non-destructive scanning process, but he’s not at liberty to disclose the details. Someone asks about privacy, and Adam refers them to Google’s privacy policy. Someone else asks if it’s true that one of the libraries requested that only manual page turning be part of the scanning, and he again invokes “no comment.”

I ask about the disconnect between the stated policy of helping the world by making information accessible and the veil of secrecy surrounding everything they do, and he’s unable to respond—says he’s only been there two years, and isn’t really familiar with the reasoning behind their policies on disclosure. I express surprise that he hasn’t asked for clarification, since I would think he’s asked this fairly often, and he says he’s never been challenged on this in a public forum before. I’d love to think that’s not true, but I suspect that the Google mystique, which they cultivate so very well, has a lot to do with that.

Lots of discussion, not all of which I capture mentally (let alone here on the screen).

Posted at 10:31 AM | Permalink | Comments (1) | TrackBack (1)
more like this: conferences | librarianship | search

internet librarian 05: search engine choices

Greg Notess and Gary Price, two genuine experts on search engines and our choices.

Greg and Gary both start out by saying “Google’s not the only answer.” It’s the job of information professionals to know all of the options, not just the most popular one. Gary notes how hard it is for anybody but Google to get the word out about their products.

Current web search engines with unique databases
* AskJeeves
* Google
* MSN (says librarians really should pay more attention to this!)
* Yahoo

meta engines
* A9
* clusty/vivisimo
* dogpile (one of the few that hits all 4)


Greg says that he doesn’t like to start his searches with Google. As a reference librarian, if he starts with something other than Google it boosts his credibility with patrons—he’s not just doing the same thing that they do! :) Shows the example of a discussion list posting that was only available on Yahoo (not on Google or MSN). If you care about comprehensiveness, you have to be willing to use multiple sources.

AskJeeves gives you a different kind of relevance view. Says they’ve come the farthest on “quick info” on a search. Shows a search on “Chicago” as an example. He and Gary then also show a search on “the Beatles,” which gives you a variety of useful “expand your search” options. They note that AskJeeves has reduced the number of ads on its pages, which many people don’t realize. (In contrast to other

MSN Search is up next. Acknowledges that not all Microsoft products are best of breed. BUT…MSN Search is no longer powered by other people’s indexes, and right now they’re doing a better job than anyone else of keeping things fresh. They also mention that MSN Search gives you free access to Encarta content. You get two hours of access each time you do a search leading to Encarta (can limit to Encarta only, or let it be part of the overall results). They haven’t promoted it, but it’s a feature that librarians should be promoting—particularly as a comparison to Wikipedia.

Shows MSN’s search builder, which is great for showing people how to build complex searches—uses drop-down boxes and sliders for ranking. They don’t show; will have to ping them about that, because I suspect they may not be aware of it.

Next up is Yahoo; they recommend that people use rather than, to avoid clutter. Shows that you can edit the tabs (there’s a tiny “edit” link up there…) to the kinds of vertical/specialized searches you want. (That’s cool! I didn’t know that!) If you’re logged into Yahoo, the settings will follow you. In advanced search, they show off the creative commons option, as well as their “subscriptions” search, which is extremely interesting (Mary Ellen mentioned this on Monday, too). He shows the blog search stuff that’s been added (that’s another post that’s brewing for me; I’m extremely unimpressed by their implementation of blog search). Then they show Mindset, as well—again, I don’t love that shopping/research is the only axis. Shows the shift from “did you mean”

Complains about lack of transparency in how search engines (especially Google) work.

Damn. I need to go to the airport, and will miss the metasearch and vertical search discussion. Hopefully someone else will blog it…I’m outta here!

Posted at 11:07 AM | Permalink | Comments (2) | TrackBack (0)
more like this: conferences | librarianship | search

internet librarian 05: parting thoughts

This is the first conference I’ve attended in a long time that’s made me want to blog non-stop. And it’s not insignificant that it’s a library-focused conference that inspired me.

When I took a job teaching information technology, instead of a job teaching in a library school, I assumed I was leaving my library roots behind. I wasn’t able to justify travel to library conferences, and I felt my ties to the profession starting to dissolve. But over the past several years, with the rise in social computing as a theme in technology, I’m delighted to find the threads weaving back together. Suddenly, librarians are talking about the same things that technologists are talking about—managing information, collaborative filtering, metadata and classification schemes. And I’m in the wonderful position of having a legitimate foot in both camps.

At the speakers’ reception last night, Michael Stephens told me he was preparing to do a survey of librarian bloggers, and asked me if I’d participate. It was lovely to be thought of as a librarian in the present tense.

And now, as I fly over Utah’s extraordinarily beautiful Great Salt Lake (I’ve never seen it before, and am grateful for the clear skies that are allowing me this bird’s-eye view…photos will be on Flickr soon), I’m thinking about how to keep these bonds a little tighter in the future. I really should touch base with some of the faculty I know at UW’s I-School, and see about maybe giving an occasional guest lecture over there. And I’ll be working hard on the folks at MSN, whose absence was notable this week. Google’s not making the mistake of ignoring libraries in their quest to win the hearts and minds of searchers, and MSN shouldn’t be making it either. If that’s the only tangible legacy I leave behind, it will have been a year well spent.

Posted at 8:20 PM | Permalink | Comments (2) | TrackBack (3)
more like this: conferences | librarianship | microsoft