More Wikipedia Hub-ub

December 16th, 2005 by Potato

Penny Arcade had a great commentary today on the recent Wikipedia issues, with a particularly brilliant comic to go with it. They seem to have a slightly different view of Wikipedia than I do, and I must admit their points are valid. I still think that the project is worthwhile and important, but it certainly isn’t going to turn into some shining gem containing the sum of all human knowledge any time soon. A quote I just loved and had to repeat:

What you’ve proposed is a kind of quantum encyclopedia, where genuine data both exists and doesn’t exist depending on the precise moment I rely upon your discordant fucking mob for my information.

I also have to agree with this point (and I think I already said something about peer review and expertise in my first article on the subject, below).

The fact of the matter is that all sources of information are not of equal value, and I don’t know how or when it became impolitic to suggest it. In opposition to the spirit of Wikipedia, I believe there is such a thing as expertise.

Now, must write faster, thesis will eat me…

Wikipedia, and General Credibility

December 14th, 2005 by Potato

There’s been a bit of a scandal lately involving Wikipedia, one of my favourite sites on the Internet. Actually, I wouldn’t call it a scandal so much as a misunderstanding that the media has blown out of proportion.

You see, one of the great things about Wikipedia is that anyone can edit it. That means when I find an annoying typo or inaccuracy, I don't have to choose between emailing an editor and letting it slide for the next user; I can easily fix it myself. It provides a huge labour force to research, edit, and refine articles, since anyone interested can help out. This is also handy for removing bias, as successive edits by different people can average it out. (Note that a bias remains, one Wikipedia itself acknowledges: editing takes internet access, free time, and interest, which means the majority of editors are from North America and Europe, and moreover are of middle-class or higher income brackets.)

However, it’s also a great source of trouble. Vandalism is a problem, as anyone who wants to defame someone, or just break something for the sake of being an anarchist, can edit a page. Worse yet are the people who think they know better but don’t, like the conspiracy theorists who insist on changing things to fit their version of the truth (adding references to the Illuminati in famous assassinations, say). It’s also hard to judge the accuracy of details: the contributors are volunteers donating their time, which makes sloppiness about proper referencing common. It also leads to stylistic problems, since different people write different segments of long articles with no unifying editor (let alone the dissonance between articles). Mistakes corrected in one part may not be corrected in another, leading to internal contradictions (I fixed one recently in someone’s biography: an “at age X in year Y” statement didn’t add up with the given birthdate, which had itself been corrected from the original author’s mistake). This may lead to a “descent towards mediocrity” as continual small changes “average out” what may have been a brilliant article: the sort of problem you get with group-think or committee reasoning.

I mention committee reasoning in particular because when two or more authors disagree on what an article says, there isn’t an experts’ panel or peer-review board on Wikipedia to resolve the argument; that is, there’s no appeal to authority available. Often a discussion page will be created where users can argue the points being presented, and they often arrive at a compromise. While that works fairly well, there are cases where it might be preferable to have experts available to resolve these issues, which could eliminate some contradictory footnotes and what-not (though choosing experts, especially in an online-only environment, is always a dangerous prospect).

Despite all that, it works for what it is. There is a vast amount of information there, and most of it serves as a decent introduction, often with some pointers for deeper reading. If you notice a page has been vandalised, you can use the “page history” to read an earlier version (and even repair the damage yourself that way, with minimal effort). And the random page function is a terrible, terrible way to encourage procrastination.

Yet this recent scandal (I don’t have a single good link to refer you to; try Google) has people (well, reporters) acting like the sky has fallen or something. Especially with the way they talk about memos being passed around newspapers telling journalists not to use it as a source for articles, and teachers telling students not to use it for their papers. I find the whole thing a little silly, and want to pull out one of my favourite quotes from Douglas Adams (I have a lot of favourite quotes from that man).

Of course you can’t “trust” what people tell you on the web anymore than you can ‘trust’ what people tell you on megaphones, postcards or in restaurants. Working out the social politics of who you can trust and why is, quite literally, what a very large part of our brain has evolved to do. For some batty reason we turn off this natural scepticism when we see things in any medium which require a lot of work or resources to work in, or in which we can’t easily answer back — like newspapers, television or granite. Hence “carved in stone.” What should concern us is not that we can’t take what we read on the internet on trust — of course you can’t, it’s just people talking — but that we ever got into the dangerous habit of believing what we read in the newspapers or saw on the TV — a mistake that no one who has met an actual journalist would ever make.

So yes, Wikipedia may have more flaws in that regard than other media, since there’s a higher chance of someone setting out to intentionally misinform people. But I don’t think it’s fundamentally worse than the other tools available: old newspaper articles may contain poor information, and corrections to them are very difficult to track down (I’ve seen articles in various papers repeating known urban legends, which suggests at least some reporters don’t research their stories very well), and regular encyclopedia entries are usually neither as in-depth nor as useful for “further reading” pointers. Moreover, they can become dated very quickly.

In other news, I’ve learned the arcane secrets of how URLkeeper works. It simply creates a full-screen frame for www.holypotato.com and then loads the webpage within. That’s why any links you click (including external ones) keep the holypotato domain in your address bar. Pointing to subpages does work within the scheme, which is handy if I want to set up links that won’t change if/when I move the site, but they won’t refresh in your address bar (for example, www.holypotato.com/favicon.ico will take you straight to the bookmark icon I made). What this means is that I can keep external links from being trapped by URLkeeper simply by including target="_top" in the link’s code, as sketched below. I’m doing that for all the new links I add (if I remember :), but I’m far too lazy to retroactively change the old links. I’m going to fool around with the sidebar to see if it will work for those, but I have my doubts.
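For the curious, the frame page URLkeeper generates presumably looks something like this. This is a minimal reconstruction of the idea, not its actual output, and the inner hostname is a placeholder:

    <html>
      <head><title>www.holypotato.com</title></head>
      <!-- One full-screen frame: the address bar keeps showing this
           page's URL while the real site loads inside the frame. -->
      <frameset rows="100%">
        <frame src="http://real-host.example.com/holypotato/">
      </frameset>
    </html>

A link written as <a href="http://example.com/" target="_top">like so</a> tells the browser to replace the topmost window rather than just the frame’s contents, which is why the address bar updates again for links that include it.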

Some Simple Math

December 13th, 2005 by Potato

First, from a conversation with Bug earlier today:

There’s a paradox, maybe you can help me with…
If Pi is infinite, that means that logically, EVENTUALLY it HAS to repeat, thus making it not infinite.

Ah, not true.
Consider a number with the following pattern:

3.14114111411114111114…

It may have segments that look like repeats of previous segments, but at no point can you circle a bit and say “now the rest is just this part over and over”. The pattern may be apparent, but it’s not actually the same thing over and over, no matter how far out you go. It can be infinite without ever repeating… that’s what makes it irrational. (Transcendental is a stronger property still; pi happens to be that too, but non-repeating on its own only buys you irrational.)

I got ya… but infinite… man… that’s infinite. Doesn’t it have to repeat eventually? I mean if it goes on forever? [comment: this is more of a philosophical aside on Bug’s part, and actually came between these two quotes] It’s like… if the universe is infinite, the molecules have to line up exactly the same as they are here somewhere far away, thus making a duplicate of this world, and if that happens and it’s infinite, then there are infinite parallel worlds.

No, because as long as it goes on, it has an infinite number of ways to remain unique (an infinite number of digits, and 10 different ways to fill each slot).

Infinite just means never-ending, not necessarily all-encompassing.

Consider a circle. Draw a smaller circle just inside. And other smaller circles (or if you prefer, Russian dolls) within. You can keep going on with an infinite chain of progressively smaller circles, but they all still fit within the finite space of your first one.
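An aside for the mathematically inclined: the fact Bug and I were dancing around is the standard textbook one (nothing original here):

    x \in \mathbb{Q} \iff \text{the decimal expansion of } x \text{ is eventually periodic}

For the demo number, whose nth block is n ones followed by a 4,

    x = 3.\underbrace{14}_{n=1}\,\underbrace{114}_{n=2}\,\underbrace{1114}_{n=3}\,\underbrace{11114}_{n=4}\ldots

any proposed repeating block of length p would eventually be swallowed whole by a run of more than p consecutive 1s, forcing the block to be all 1s; but then no more 4s could appear, and they keep appearing. So the expansion never settles into a cycle, and the number is irrational.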

Some even simpler math to ponder. Right now my server is consuming something like 75 W of power (the power supply is rated to 220 W, but it’s only running light, non-processor-intensive things; the drives aren’t being taxed, and from what I hear it doesn’t even do many seeks). That’s 75 W × 24 h × 30 days = 54 kWh used in a month. At about 5 cents/kWh, it’s costing me about $2.70/month to run. It is a bit of a pain, though, as I experienced earlier today; since it’s a crappy old system that I don’t much mind opening up to the ravages of the internet, it needs to be reset every 4-7 days, or it’ll just run itself in circles until there’s no more memory left and lock up. That’s a minor annoyance, but it also means that if I ever go on vacation for more than a week, I can expect my webpage to stop serving itself before I get back.
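If you want to fiddle with the assumptions, the whole calculation fits in a few lines of Python (a throwaway sketch; the wattage and electricity rate are just my estimates from above):

    # Back-of-the-envelope monthly cost of an always-on server.
    watts = 75                                # measured draw; supply is rated to 220 W
    rate = 0.05                               # dollars per kWh, roughly
    kwh_per_month = watts * 24 * 30 / 1000    # 54.0 kWh
    cost = kwh_per_month * rate               # $2.70
    print(f"{kwh_per_month:.1f} kWh/month -> ${cost:.2f}/month")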

The main goal of this setup was to give myself unlimited access to the Apache, MySQL, and PHP tools I’d need to get a site up and running, and I have partially done that. I actually haven’t learned much in the way of useful skills, more about how to get the programs themselves running: anything I’m using is a pre-written script, and I still have no idea how to properly set up a MySQL database or access it via PHP. But that’s beside the point; as far as hosting my webpage goes, I think I know enough about WordPress and the other software to be able to use a real host. Based on the cost and the pain-in-the-butt factor, ideally I’d like to find a host in the ~$6/mo range (that’s the price point where I think switching to a real host beats what I have now). So far the lowest I’ve seen is ~$10 CDN (aside from the free ones, which may yet be an option). Let me know if you have any recommendations; I have no problem giving referrals if your service has a promotion for them. (Likewise, if anyone is moving and setting up a new Rogers account, feel free to say I referred you so I get a free month :)

Back Up

December 13th, 2005 by Potato

Minor outage today, server stalled on a virus update and needed to be kicked. I think it was idling for about 4 hours there, but I doubt anyone was trying to connect anyway.

Speaking of trying to connect, I found out I had another reader today. At this rate of growth it’ll only take a month or two before people I don’t even know start to visit. The hit counter reached a one-day high of 24 yesterday, though I should temper that news by noting that because of the way URLkeeper works, the counter is triggered by every page view (i.e., reading the main page, then the comments page or the archives, each counts, and in some cases it may keep triggering as you follow an external link). It’s not, however, counting my own hits, since I’m tunneling in behind it.

Much Hate For Rogers

December 12th, 2005 by Potato

I like the internet. I’m an internet junkie. In fact, odds are that you are yourself using the internet to access this very page!

For those of you unfamiliar with the internet (for instance, those of you who have been handed a printout of my website by a kind, and yes we agree, a slightly condescending relative), let me introduce some very basic concepts for you.

First off, there are files on computers that are transmitted so that you can read them. This is how information is shared. It’s a very basic concept. The transmission involves electrical signals running on conductive wires (or pulses of light if you’re really fancy, or radio waves if you can’t stand clutter), and the equipment to run all that costs money. So, in order to access the internet and the vast number of files out there, I’m going to have to pay for it. Maybe it’ll be through my taxes, and the government/libraries will provide computer terminals for me to borrow; or maybe I’ll pay a company like Bell or Rogers to string a cable to my house so I can access the internet from my own computer whenever I damned well please. One of the great things about the capitalist economy we live in is that if I want to transfer more files/information per unit time, I can pay for a more capable connection: the more I’m willing to pay, the more bandwidth they’ll sell me.

I’m coming at this rant a little obliquely, and I can tell I’m about to lose the last two readers I have, so let me jump ahead a bit in the train of thought.

I currently use Rogers as my ISP, and it is readily apparent that they are engaging in the worst form of bait-and-switch business practices, just a hair’s breadth away from all-out customer buggery. I’ve had trouble with their customer support before, and with some of their very odd decisions, but things are reaching a ludicrous level now.

In recent history, they’ve been curtailing the benefits of their high-speed cable modem service by cutting out things not directly related to surfing the web (note: the web is a subset of the internet). Webpage hosting, a very basic level of service from virtually every ISP, was included with Rogers, at one point touting up to 20 MB of space so you could share photos and the like. Then, with virtually no warning,* they deleted everyone’s webpage at members.rogers.com/username and asked that you set up a special Geocities account. I was very worried about that, since Geocities is an ad-supported free service (why send your paying customers to a free service?) which has had some very questionable user agreements in the past (at one point they claimed to own the copyright on anything they hosted). While the special Rogers version didn’t have any ads, the address contained your full email address! Admittedly, it’s not exactly rocket science to figure out someone’s email address from the old pages, but it wasn’t sitting there for every mindless spamtrollbot to find and add to its repertoire; furthermore, the old URL didn’t contain the @ symbol, which can cause all manner of headaches.

* To be honest, there was technically warning, but not what I would consider fair warning. A very spammish-looking email was sent out after the Rogers-Yahoo merger deal, saying simply that there would be changes to webhosting, with a link to their transition page, which in turn had a link explaining that the changes were coming in 6 months’ time. No further reminders were sent.

Overall, the webhosting situation was very poorly handled and, in my mind, a very bad idea. If they were merging with Yahoo/Geocities and wanted to use the Geocities servers to actually serve the webpages, why not set up rogers.com/user aliases for all of them? While it’s not as spiffy as a custom domain name, a rogers.com/companyname webpage is something commonly used by a number of small businesses, many of whom would not bother to read three links deep into the Rogers/Yahoo advertisement to learn about the changes. They were completely sideswiped: they had to change business cards, scramble for a new host (since a Geocities site commands zero respect in business), and hunt down backups of their webpages, since the old server’s copies were deleted.

Meanwhile, they introduced their “Extreme” level of service, featuring a new $100 modem that you had to buy yourself, but which provided a higher throughput rate (advertised at up to 5 Mbps down, versus the 3 Mbps of the previous Rogers service). At the time, I was having some trouble with my cable modem, probably due to the large influx of students into my building/subnet, which was causing me to get less than 1/3 of my advertised bandwidth. The tech recommended that I buy the new modem, which ran on a separate subnet and shouldn’t be prone to those sorts of problems. Were I a more argumentative person, I probably would have demanded that they give me the modem (I pay Rogers over $75 a month for TV and internet; surely, with 3 years as a customer on my own plus decades at my parents’ house, it would have been a worthwhile investment). At any rate, I bought it, and things were better (I was getting half the advertised “extreme” bandwidth, bringing me up to the regular cable modem rates).

However, Rogers doesn’t want to actually give up that much bandwidth, so they introduced a new “feature”: transfer caps. There is a set amount you’re allowed to download each month (60 GB), after which you get he-bitch man-slapped down to dial-up speeds until the next month. This was to keep the “excessive” (“extreme?!”) users from choking the whole network with their always-on connections. To be fair, it’s not a bad limit: even I haven’t hit it yet, and I’d consider myself a big (but not excessive/abusive) user of my internet access. But it is still somewhat restrictive: you could hit it in just over 26 hours if you were able to continuously harness the maximum capacity of your modem (see the math below). Perhaps the fact that so few people get cut off speaks volumes about the difference between advertised bandwidth and actual.
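For the record, here’s where that 26-hour figure comes from (taking the advertised 5 Mbps at face value and calling 60 GB an even 60,000 MB):

    5\ \text{Mbps} = 0.625\ \text{MB/s}, \qquad \frac{60000\ \text{MB}}{0.625\ \text{MB/s}} = 96000\ \text{s} \approx 26.7\ \text{hours}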

So, that’s fine. I don’t like it, but I can see it from their point of view and let it slide. Then they axe newsgroups. Ok, whatever. I do use them, but not that often. It was a basic level of service granted by all ISPs and part of what I pay for, and now they just tell me to go get a Giganews account for $15/mo or whatever. No discount on my bill or anything. That part makes me kind of mad, since it really seems like a bait-and-switch.

But the latest insult is even worse: they’re restricting what you can do with your bandwidth. Sometime between August and now (I’m not sure exactly when) they introduced bandwidth-limiting measures above and beyond the 60 GB cap. These target specific programs and sources of files, and are causing certain users a ton of headaches (including me!). Worse yet, they never told anyone that limits were going in; at least with the overall 60 GB cap and the usenet cancellation, there was notice. So far, there’s been no official confirmation that throttling (“packet shaping”) is in use. And it’s being used very stupidly: I can understand throttling so I can’t use the raw brute force of my extreme connection for more than short bursts of time, but leave me with enough bandwidth to make some kind of progress. Cutting it down to below dial-up, so that I’m looking at month-long completion times, is way over the top. 100 kB/s I would be quite happy with. Even 25 kB/s I might write off as a vaguely decent connection, possibly the fault of the other side. But peaking at 5 kB/s, and averaging a mere 1 kB/s, is simply not acceptable. It obviously points to a problem, which I tracked back to Rogers and their throttling.

It seems to target BitTorrent and other programs selectively (some users have reported trouble downloading from iTunes). This is, simply, absurd. BitTorrent (and iTunes!) have some very legitimate uses.

If you’re not familiar with BitTorrent, let me tell you about its miraculous nature. Downloading a large file from a single server is straightforward: you connect to the server, it uploads, you download. The server must pay for a lot of bandwidth to transfer those large files to many users. However, if those large files are time-sensitive, such as game patches (everyone wants to get patched up the day the patch comes out!) or movie trailers (ooh! new HD X-Men 3 trailer!), servers in that client-server paradigm can get hit hard. They have to pay for really, really big “data pipes” to handle the peak demand on release day, bandwidth which may go to waste the rest of the month.

So an ingenious system was created. If 100 users want a file, instead of sending complete copies to all of them simultaneously, the server sends out pieces: user 1 gets piece 1 while user 2 gets piece 2. Then, instead of user 1 getting piece 2 from the server, he trades with user 2, and they get their missing pieces from each other. This happens across all the users, and in the end the server only needs to send roughly one complete copy out, with the pieces traded back and forth amongst the users (in practice the server sends somewhat more than that, but it’s still a significant saving over the old model). The beauty of the system is that each user tends to have upload capacity they were paying for but not using, so the bandwidth savings are essentially free. It also spreads the load out, which can be healthier for the internet architecture as a whole (though it does create more overhead).
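To make the saving concrete, here’s a toy version of that counting argument in Python (a sketch only; real BitTorrent adds trackers, tit-for-tat, rarest-first piece selection and plenty more):

    # Toy comparison: classic client-server vs. swarm-style distribution.
    peers, pieces = 100, 100

    # Client-server: the server uploads every piece to every peer.
    server_sends_classic = peers * pieces              # 10,000 piece-transfers

    # Swarm: the server seeds each piece to a single peer (one full copy)...
    server_sends_swarm = pieces                        # 100 piece-transfers
    # ...and the peers trade amongst themselves for everything else.
    peer_sends = peers * pieces - server_sends_swarm   # 9,900 piece-transfers

    print(f"client-server: server uploads {server_sends_classic} pieces")
    print(f"swarm: server uploads {server_sends_swarm}, peers upload {peer_sends}")

The total traffic is the same either way; the trick is that nearly all of it moves onto upload capacity the users were already paying for but not using.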

BitTorrent is used mostly for large files, particularly ones where there isn’t the money to host a high-bandwidth server: think Linux distributions, game patches, amateur movies and the like. Yes, the protocol itself doesn’t care what is sent, so you could just as easily send a stolen Hollywood movie as a copy of your digitized vacation footage for the extended family. But the point is that Rogers should not be limiting users’ ability to use the program. I pay for the bandwidth, and I expect to use it.

Some helpful things I’ve learned while researching this:

  1. Rogers is running its software on almost all ports, so using non-standard BitTorrent ports will not help you.
  2. Rogers has reserved the ports related to “permitted” web activities so that, at the moment, they are not being throttled.
  3. Xbox Live and VoIP are considered “permitted” uses; look up those ports and run BitTorrent over them until Rogers kills them, too.
  4. This is being done on a neighbourhood-by-neighbourhood and IP-by-IP basis. Just because you haven’t seen the problem yet doesn’t mean you won’t soon.

Each of these restrictions to my internet access, combined with a price tag that only goes up, leaves me with a very bad taste in my mouth. So why, you ask, don’t I switch? Well, with the most recent changes limiting downloads via BitTorrent and peer-to-peer (under 5 kB/s now; that’s worse than dial-up, man) and my extremely bad packet-loss situation in WoW, I’m considering it. Before those hit, there simply wasn’t another option.

The faults in dial-up are obvious. It consumes a phone line and it’s slow. If you pay for a separate phone line, it’s not even necessarily cheaper.

(A)DSL is more subtle. Notice the A that is often dropped from DSL: it stands for “asymmetric”. That means your upload and download speeds are different. While that’s true of most connections (my cable modem can download about 10X faster than it can upload), it’s particularly troublesome for DSL, since the upload is so very low: often about the same as dial-up. That can pose trouble for certain applications (voice chat, games, or a download that arrives as a lot of small packets and so requires sending a ton of ACK packets back up). DSL is, in short, not the gamer’s choice. On top of that, the peak download speeds are often lower. While you don’t suffer from the evening rush as cable modem users do, since you don’t share your bandwidth with the rest of the neighbourhood, it also means you can’t set a big download to run overnight, since you’ll do no better then. Since I’m mostly nocturnal, the evening rush isn’t a big drawback of cable for me anyway. DSL also depends a great deal on the quality of your phone lines; despite the telemarketers calling us every month to push it on us, the wiring in this building is simply not up to the task. Once I took out one of our phone jacks and the wires behind the wall were all corroded. I stripped them back to some fresh metal to try to improve the connection (I think it helped a bit, but the phones are still a bit fuzzy here; who knows what shape the main box in the basement is in?). Also, the cost of DSL is usually just about the same as cable, which makes staying put even more of a no-brainer for me.
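To see why a skinny upload can choke a fat download, here’s a back-of-the-envelope sketch (the figures are typical ballpark values, not measurements: ~1500-byte data packets, ~40-byte ACKs, and TCP acknowledging every second packet):

    \frac{5 \times 10^{6}\ \text{b/s}}{1500 \times 8\ \text{b/packet}} \approx 417\ \text{packets/s} \;\Rightarrow\; 208\ \text{ACKs/s} \times 40 \times 8\ \text{b} \approx 67\ \text{kbps upstream}

That’s roughly double what a dial-up-grade upload channel can carry, so the download ends up stalling on its own acknowledgements.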

However, if I’m not actually able to pull files any faster than DSL anyway due to throttling, then I might consider switching if/when I move.

One final thing: I ran across a neat feature while troubleshooting my problems (I swear to you, four hours fiddling with UPnP and my router and posting on forums and all kinds of garbage, only to find Rogers had throttled me). It turns out many cable modems let you access them to check your signal levels yourself. Look at http://192.168.100.1/ and see if yours does. (My levels are currently 39 dB SNR, 2 dB power level down, 43 dBmV up. I’m still not quite sure what they mean or what they should be.)

The question on my mind is: what do I do about it? They say there’s a way to check how close you’re getting to the global 60 GB limit, but I couldn’t find it under my account management page. Why they need to kill specific programs in addition to the global limit baffles me. I can’t call tech support and say “Hey, you nooblars throttled my BitTorrent ports! Now I have to use port XXXX to get my latest Linux distro!” since, firstly, that’ll just clue them in to limit more ports (even if I don’t tell them which, or even that I’ve found a work-around), and secondly, the guys who answer the phone are for the most part poorly paid cue-card-reading monkeys who don’t deserve the full force of my wrath (hi Bug!). Is anyone familiar with the exact laws/protections regarding bait-and-switch, changing services provided, or anything else? I think a paper letter might help most for the general case, though a phone call demonstrating my legitimate need to get unthrottled might help me in particular more. Complaining to the Better Business Bureau might help, but from what I understand they want you to try to work it out with the company in question first, with a paper trail to show that they’re complete ‘tards, before they give them a black mark… which doesn’t seem to amount to much anyway.

Some things I haven’t had time to read yet:
Rogers’ end user agreement.
Rogers’ acceptable use policy.

Update: Found this on Rogers’ site regarding the Usenet discontinuation.

Usenet discontinuation: If Rogers is discontinuing a service, shouldn’t I get a discount on my bill?
Answer
There was no charge for this service. Rogers has introduced many new high-value services free for Rogers Yahoo customers. For example, RY Photos with unlimited storage, commercial-free Internet radio (Launchcast), a special Rogers Yahoo browser with premium features such as tabbed browsing, free premium personal web space and free blogs. As the Internet changes, it is reasonable to expect that new services will displace older. On balance, the total package for Rogers Yahoo customers continues to improve in both scope and depth.

I have to call bullshit. It may not have been a line item on our bills, but that doesn’t mean it was free: it was definitely something they factored into their decision to charge $48.10/mo instead of any other number. And giving us ad-free versions of crap Yahoo already offers everyone isn’t adding much value. For starters, I doubt it costs them much if Yahoo can offer it based solely on the support of ads. And secondly, their free stuff looks and feels free. The Yahoo 360 blogs are really terrible; you’re better off going with Blogger.

Also, I found the usage measurement tool they speak of. It turns out there are two completely separate account-management areas for Rogers: one is the Rogers/Yahoo portal, where you can add additional email subaccounts and the like; the other is on the main Rogers page, which lets you view your account and billing details. I’m using just under half the 60 GB limit (based on half a month’s use).