300 entries!

Raleigh, North Carolina

Well, everyone, it's been nice sharing this time with you on a homemade blog for three hundred weeks. My blog continues on WordPress.

Urban Legends of the State Fair

Raleigh, North Carolina

I was in the office when my team leader Rob told me there was going to be a famous guy - Stephen Wolfram - at the fair. He said that Wolfram was going to be identifying people's regions of origin within North Carolina based on their accents. When I arrived at the fair I looked for him with my friend James. When I asked the woman at the information booth about Wolfram's exhibit, she said she had no idea what I was talking about, but it might be in the Expo Center if she had to guess.

It wasn't in the Expo Center. As I stood disappointed beside some giant pumpkins, James suggested another building he knew of at the fair. Down the halls we roamed. Eventually I ran across a booth with dialects of North Carolina plastered all over the walls. I didn't see any bombastic famous types there, but I figured he would be too busy and have his assistants run the booth, so I strode forward and started talking, encouraging the two booth people to identify my dialect.

The poor, befuddled booth people explained to me that it's very hard to identify someone's region by their dialect alone, and said that they had never claimed they could do any such thing. They had also never heard of Stephen Wolfram. Nevertheless, I had worked too hard not to be identified, so I insisted that they at least try. The young woman guessed I was from within an hour of Raleigh, which was right, and was actually rather easy, since I'm well known for not having much of a North Carolina accent at all, which is appropriate for the Triangle area, where people from all over the world gather.

Then James said, "Yeah, I'm from upstate New York, but can you guess which part?"

The young woman guessed upstate, and James later insisted that he had not already told her the answer no matter how much I tried to convince him otherwise.

All in all it was a good time, but I'm still going to have a talk with Rob when I see him tomorrow.

One more thing - if you want to be part of deciding stuff about what the next iteration of the blog will be like, I'll be soliciting ideas in the comments. First question: What should the new blog be named?


Raleigh, North Carolina

I've spent the last week writing a five page paper. Single spaced. Double-column. 9-pt font.

The length isn't even the issue; in fact, it's been a bit of a struggle to make it as short as it is. I've actually made the format more dense, changing from tiny spaces between paragraphs to indentations and removing all the spacing between the references. Even so, my entire system's architecture diagram is stuffed onto one tiny column, covered in bubbles just big enough for the font to be legible and with short, stubby arrows filling the tiny spaces between them. The caption on this architecture diagram reads "certain modules omitted for simplicity." I started this paper as an outline on Monday. It's due Monday. Specifically, it's due Monday at 11:59 PM (UTC-11), which means Tuesday at 6:59 AM in Raleigh. Fortunately, I have two brilliant professors, both co-PIs on my project, who are helping me author this thing.

I work on it for a while and turn it in to one of them, who sends it back to me with to-dos. An awful lot of these to-dos are formatting issues, like removing the space between paragraphs. I was lucky enough to know how to use a reference manager to automatically handle my references, because one of my tasks was to "change references to be alphabetically ordered." That means all of the little numbered citation boxes need to be updated, too, and then I would need to check them to make sure they're pointing correctly. My advisor specifically asked me to do this. For my own personal health I will not imagine doing it by hand, as so many other academic authors apparently have done.

The paper just keeps getting better, though. Most people who look at it have been impressed, even this morning when the paper was almost entirely different. I'm thinking that after much reviewing and many iterations the explanation of my system may be understandable to people who aren't me, but I'll give it another week before I dump it onto my blog.

Speaking of my blog, it's getting pretty old.

Whoa, whoa, no, that's not what I mean. I have no intention of shutting down. I mean this hand-made PHP and XML stuff that I've been using since 2008, while perfectly serviceable, has weaknesses. Probably first and foremost is images. I can't tell you the number of times that I have had beautiful pictures to share with you but never got around to actually uploading them because whatever system I tried was just too onerous to keep up with regularly. What's more, I've never been able to do quite what other blogs can do with their images. I can't seamlessly toss an image directly into my text to illustrate a particular point, which I would love to start doing. Probably even more important than that is that, let's be honest, this blog is a little ugly. At the right screen resolution everything fits right and looks great, but on mobile devices and on larger screens things go rapidly downhill. I'm actually embarrassed to even let people know that I have a blog, it's so ugly. None of these are dealbreakers. The best part about a hand-made site is that I can fix any problem with it I want with some time and effort. The killing blow is that I just don't have that time, and I have a million other more important things on which to spend my precious effort.

So why do I keep it?

I've looked into moving my blog database into a new format, and it's not as insurmountable as it initially seemed. I won't lose any of you when I move because I can make www.shodor.org/~smunk/blog automatically redirect to wherever my new site ends up. The big sticking point has been comments. For the longest time I've had my own personal low-barrier solution for letting people post comments on my site. Identify an animal - that's it. None of these big blogs have picked up on this solution, and all demand logins or, at best, confusing, annoying captchas that will deter people who are unfamiliar with them and less than fully determined to leave me a comment. Comments are my favorite part about this blog, so I've been reluctant to add any additional barriers to entry on the comment front, and that's why my blog has stayed the way it is for so long. It can't stay that way forever, though, and now that I've noticed my writing and subject matter slowly climbing in quality, it's past time the software got better, too. I can't say when exactly it'll happen, but The Blog Formerly Known as Sam's Japan Blog is due for an upgrade.

Geez, that title's due for an upgrade, too.

Technique X

Raleigh, North Carolina

My written qualifier is coming along nicely. It's reached the stage where it's beginning to look like an actual paper, which is pretty gratifying. One small problem - I still don't like the name for my technique.

It all started when I found out that Thomas Landauer - the creator of the essay grading technique on which mine is loosely based - had named his technique "Direct Prediction." This may or may not seem like a stupid name to you, but it did to me the first time I saw it. In fact, the reason he calls his technique "Direct Prediction" is that at the time he invented it, the state of the art involved researchers coming up with measures that they thought defined the quality of an essay and then using those measures to automatically assign a grade. Landauer's technique, on the other hand, takes the essay to be graded, finds the human-graded essays most similar to it according to a given similarity metric, and assigns a grade based on the grades of those essays. So to distinguish his novel approach from the standard of the time, he said he was making a "direct prediction" from previous human scores. Now that statistical techniques in natural language processing are commonplace, the term "direct prediction" is no longer meaningful. Landauer's technique might be better referred to as an aggregation of the grading precedents set by the human graders, or "precedent aggregation." I was so happy when my professor appeared to agree with my analysis and let me change the name of Landauer's technique.
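To make the "grade it like its nearest graded neighbors" idea concrete, here is a rough Python sketch. It is only an illustration under my own assumptions, not Landauer's actual system: the similarity metric is a plain word-count cosine standing in for whatever metric is really used, and the names `cosine_similarity`, `predict_grade`, and the parameter `k` are made up for this example.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors (an assumed stand-in metric)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_grade(essay: str, graded: list[tuple[str, float]], k: int = 2) -> float:
    """Grade an essay from the k most similar human-graded essays.

    The grade is a similarity-weighted average of the neighbors' grades
    (hypothetical aggregation rule; other rules are possible).
    """
    neighbors = sorted(
        ((cosine_similarity(essay, text), grade) for text, grade in graded),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        # Nothing is similar at all: fall back to a plain average.
        return sum(grade for _, grade in neighbors) / len(neighbors)
    return sum(sim * grade for sim, grade in neighbors) / total


# Toy data, invented for illustration.
graded_essays = [
    ("the cat sat on the mat", 5.0),
    ("dogs bark loudly at night", 1.0),
    ("the cat sat", 4.0),
]
prediction = predict_grade("the cat sat on the mat", graded_essays, k=2)
```

With this toy data the prediction lands between the two closest essays' grades, pulled toward the more similar one, which is the "aggregating precedents" behavior in miniature.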

Then I tried to call my technique precedent aggregation and there was a problem. My technique is different from Landauer's not only in that it grades short answers instead of essays but also in that it collects many different subsets of the pre-graded answers and calculates various statistics on the similarities of the answers in each set to the student's answer. These statistics it sends to a machine learning algorithm that generates a system to assign grades to answers. So, I can't call my system precedent aggregation because that would be giving the same term two different meanings. Precedent aggregation was the best term I had and I wasted it on some ten-year-old technique that already had a name.
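Here is a small Python sketch of what "statistics on the similarities in each subset" could look like. Everything specific in it is an assumption for illustration: the subsets are simply "all pre-graded answers with a given grade," the statistics are max, mean, and min, the similarity is a Jaccard word overlap, and the names `jaccard` and `simstat_features` are hypothetical. The real system's subsets, statistics, and similarity metric may differ.

```python
from statistics import mean


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two answers (assumed similarity metric)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def simstat_features(answer: str, graded: list[tuple[str, int]], grades=(0, 1)) -> list[float]:
    """Build a feature vector: for each grade's subset, max/mean/min similarity."""
    features = []
    for g in grades:
        # Similarity of the student's answer to every pre-graded answer with grade g.
        sims = [jaccard(answer, text) for text, grade in graded if grade == g]
        if sims:
            features += [max(sims), mean(sims), min(sims)]
        else:
            features += [0.0, 0.0, 0.0]  # empty subset: no evidence either way
    return features


# Toy data, invented for illustration.
pregraded = [
    ("water boils at 100 degrees", 1),
    ("water freezes at 100 degrees", 0),
    ("ice is cold", 0),
]
features = simstat_features("water boils at 100 degrees", pregraded)
```

A vector like `features`, computed for each training answer alongside its human grade, is the kind of input a machine learning algorithm could use to learn a grader.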

With Precedent Aggregation unavailable to me, I tried calling my technique Precedent - Reference Analytics (PRA) because it collects analytics on both the reference answer and the pre-graded precedents, but I had named my problem Constructed Response Analysis (CRA), and repeatedly saying "PRA" and "CRA" in a presentation was just asking for me to misspeak and for people to become terribly confused. Now I'm calling it Similarity Statistics (SimStat), which, while it is an accurate description, sounds overly simplistic and may lead people to think I haven't done any interesting work at all. That's the name I'm sticking with for the moment, though. We'll see if my professor asks me to change it again.


Raleigh, North Carolina

So, I have a new technique for self-improvement. Originally I called it "Waitercizing" because it was based on doing quick exercises while waiting for something, like code to run, but that was an incredibly stupid name. Now I've come up with a better name and a tweak on the strategy. Micro-Workouts are simple, quick exercises that focus on improving personal weaknesses. For example, I have good lower body health because of my bicycling (at least, I imagine I do), but I have no system in place that keeps my core and my upper body in shape. This is because I can't stand any exercise that doesn't involve a goal (e.g. getting from my home to my work). Here I say I can't stand any exercise, but the key to micro-workouts is that this is just a turn of phrase. Can't I stand any exercise, or can't I stand what is generally considered a "correct" amount of exercise? Thirty minutes of running is a workout. One jumping jack some might say is not a workout, but if one jumping jack is hard to do, then I argue it is a workout.

I can do one jumping jack. It's not hard for me. One pushup is not hard for me. Five pushups were a little hard for me, but not so hard that I dreaded doing them each day. So I did do five pushups a day, just whenever I thought of it and I was in a private place where I wouldn't embarrass myself. Now five pushups are easy for me and ten pushups are a little hard, but not hellishly so, so that is my daily goal. No need to push myself or strain or struggle. I'm not trying to achieve some meaningless goal, I'm just trying to be healthier. I don't need to repeat everything you've probably already heard about the benefits of exercise, so consider adding a micro-workout to your daily life.

