Traveling Happy Truck Drivers

Starting out as an explanation of the traveling salesman problem, this article goes on to also be an excellent, and I think understandable, explanation of what an algorithm is and what computational complexity means. If you want to get a quick sense of what computer science is concerned with beyond just “how to program”, the “how to solve problems” side of things, this is a good read.

In 2006, for example, an optimal tour was produced by a team led by Cook for a 85,900-city tour. It did not, of course, given the computing constraints mentioned above, involve checking each route individually. “There is no hope to actually list all the road trips between New York and Los Angeles,” he says. Instead, almost all of the computation went into proving that there is no tour shorter than the one his team found. In essence, there is an answer, but there is not a solution. “By solution,” writes Cook, “we mean an algorithm, that is a step-by-step recipe for producing an optimal tour for any example we may now throw at it.”

The second half of the article also has some nice details about how the messy ways that people behave make the route planning problem for a company like UPS a lot more complicated than the pure traveling salesman problem:

People are also emotional, and it turns out an unhappy truck driver can be trouble. Modern routing models incorporate whether a truck driver is happy or not—something he may not know about himself. For example, one major trucking company that declined to be named does “predictive analysis” on when drivers are at greater risk of being involved in a crash. Not only does the company have information on how the truck is being driven—speeding, hard-braking events, rapid lane changes—but on the life of the driver.

This loops back to talking (informally again) about algorithms, solutions, optimization, and the idea of a heuristic approach. I’m wondering if this would also be a nice way to illustrate the idea of modeling – I’ve found that it’s a word we use a lot, and it’s easy to nod and say “sure, a model”, but things can get confusing as you start to slip between our informal, day-to-day usage of the word and a more technical or formal meaning. This might help bridge the gap, particularly as it starts to touch on the idea of good versus bad models.
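To make the gap between “checking each route individually” and a heuristic concrete, here is a minimal sketch. It is not anything from the article or from Cook’s team: the city coordinates are invented, and nearest-neighbor is just one simple heuristic I picked for illustration.

```python
from itertools import permutations
from math import dist, factorial

# Hypothetical city coordinates, invented for illustration.
CITIES = [(0, 0), (3, 0), (3, 4), (0, 4), (1, 2)]

def tour_length(order, cities):
    """Total length of the closed tour visiting cities in the given order."""
    return sum(
        dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )

def brute_force_tour(cities):
    """Check every ordering that starts at city 0 and keep the shortest."""
    rest = range(1, len(cities))
    return min(
        ((0, *perm) for perm in permutations(rest)),
        key=lambda order: tour_length(order, cities),
    )

def nearest_neighbor_tour(cities):
    """Heuristic: always drive to the closest city not yet visited."""
    order = [0]
    unvisited = set(range(1, len(cities)))
    while unvisited:
        here = cities[order[-1]]
        nxt = min(unvisited, key=lambda c: dist(here, cities[c]))
        order.append(nxt)
        unvisited.remove(nxt)
    return tuple(order)

best = brute_force_tour(CITIES)
quick = nearest_neighbor_tour(CITIES)
print("brute force:     ", best, tour_length(best, CITIES))
print("nearest neighbor:", quick, tour_length(quick, CITIES))

# Orderings the brute-force version would have to check for just 20 cities.
print(factorial(20 - 1))
```

The brute-force version is guaranteed to find an optimal tour but does factorially more work as cities are added; the heuristic is fast but only promises a tour, not the best one.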

Too many options

I’m finding a lot interesting to think about in this discussion of the Guided Pathways to Success conference and its investigation of the benefit to students of guidance/constraints in their educational paths: “Schwartz emphasized that even though it may seem counterintuitive and even paternalistic, students are actually much more empowered by choosing among fewer and more carefully constructed options.”

My first thoughts are about the curriculum we just instituted, which I have thought of as giving students more flexibility and choice about how they put together sets of courses to complete a major or minor. We try to make clear to students that they do not need to start with completing our list of “core” courses – that the electives are just as central to the major, many of them can also be at the introductory/no-prerequisite level, and may even be more interesting or compelling to them than the core, depending on their interests. Thinking about the core, though, as a delineated path that counteracts the excess of choice when looking at the entire catalog makes this behavior make sense, and somehow makes me more comfortable with that choice of how to approach the major. I think I’ve been able to mentally shift how I see our curriculum to being one that gives those students who want to have lots of freedom and choice that option, but does spell out some clear paths for students who prefer that as well.

Thinking more broadly, this also relates to some thoughts I’ve been having about MOOCs and initiatives to try to allow students to assemble degrees piecemeal out of courses from many institutions of many different types. The implicit question behind those initiatives is, with free or near-free education available on-line, what is gained by a more traditional school? One answer seems to be exactly this structure and advising, particularly highly personalized advising, which is essentially a collaborative narrowing of choices with the student.

Digressing a bit from the original point, I also worry a bit about what is missing from a student’s overall education when education is constructed in such a piecemeal fashion. For my own program, I think about how we teach ethics. It’s not unusual to tackle this by spreading ethics instruction out amongst several courses, teaching it alongside more technical content, as compared to having a single, designated ethics course – this is the approach we take. Obviously, I think it is a good choice – students see ethics from many perspectives and throughout their time in the program, and they see it integrated with their other activities in the field. If a student is assembling a degree, though, from a set of courses at many different institutions, I have a hard time seeing how content can be spread throughout a curriculum in this way. In theory, if every course labeled every piece of learning content with the number of hours allocated to it in the course, a system could be constructed to ensure that all boxes were checked to a sufficient degree. But this feels unwieldy, and I suspect the more tractable approach would be to fall back on mandating courses covering any required content areas (perhaps permitting half/quarter courses to make up the slack).
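For what it’s worth, the box-checking system itself is not the hard part. A minimal sketch, assuming courses really did publish per-topic hours (the course names, content areas, and hour requirements below are all invented):

```python
from collections import defaultdict

# Hypothetical requirement: minimum hours per content area for the degree.
REQUIRED_HOURS = {"ethics": 12, "algorithms": 40}

# Hypothetical course labels: hours each course devotes to each area.
TRANSCRIPT = {
    "Intro CS (School A)": {"algorithms": 25, "ethics": 3},
    "Databases (School B)": {"algorithms": 10, "ethics": 4},
    "Web Dev (School C)": {"algorithms": 8, "ethics": 2},
}

def coverage_gaps(transcript, required):
    """Return {area: hours still missing} for every under-covered area."""
    totals = defaultdict(int)
    for labels in transcript.values():
        for area, hours in labels.items():
            totals[area] += hours
    return {
        area: need - totals[area]
        for area, need in required.items()
        if totals[area] < need
    }

print(coverage_gaps(TRANSCRIPT, REQUIRED_HOURS))
```

The unwieldy part is everything the sketch takes for granted: institutions agreeing on what the content areas are, labeling hours honestly, and hours of scattered exposure actually adding up to the integrated experience.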

Of course they put E.T. in New Mexico

The news that the landfill of Atari’s E.T. games is going to be excavated swept through the internet. This makes me doubly excited that I still have my copy and I’m considering using it as an anchor point for a “play bad games” day in my intro game design course in the fall. Particularly having also found this really cool review of the game’s flaws and fixes for them. It starts from the position that the game is actually fairly good, and even groundbreaking, except for a few flaws or misunderstandings about the game (such as, that it is an easy kids’ game to pick up, as compared to a highly challenging quest-based game) that need to be addressed. Plus you get a nice detailed explanation of why E.T. keeps falling in the darn wells all the time, and an inside look at the problem of modifying space-restricted code (there is lots of talk of finding 12 bytes here and 9 bytes there to sneak in the desired changes).

Using MOOCs to raise the bar

A recent article about how MOOCs might, in fact, increase and not decrease costs on college campuses has been getting a fair bit of attention. Its argument is that the large lecture classes MOOCs would replace were already the cost-saving venues of higher education, and that many of the proposals for integrating MOOCs well involve replacing these cost-efficient large classes with free MOOCs plus expensive associated mentoring. Additionally, it observes that even if a college doesn’t choose to incorporate MOOCs, the fact that they exist may make students less tolerant of paying tuition for large lecture courses.

The quote that jumped out at me, though, came in the middle of this argument:

The large lecture class is efficient, with a low per-student cost as the expense of the instructor resource is spread across so many students. Every institution of higher learning would love to only have small classes, but the economics simply don’t work. Faculty are too expensive. The large lecture class subsidizes everything else.

Except, unless I’m interpreting the definition of “large” and “small” incorrectly here, I’m pretty sure that I went to a college with only small classes, and currently teach at a college with only small classes. The article’s assertion that MOOCs will lead to a shift where undergraduate education must be personal and interactive is at the same time a suggestion that there will be a strong and possibly growing market for the small, liberal arts college experience.

Of course, that requires a lot of education about what that experience really is, and how much it differs from what many people assume a college experience must be. And, as the article notes, it isn’t cheap – certainly not as cheap as MOOCs can be. But it seems that perhaps the national discussion about MOOCs can be a real opportunity to communicate those differences.

Data Vis Roundup

I’m supervising a capstone project right now where students are providing data analysis and visualization support for a local organization, and the following set of links has been queueing up in my feed as to-read items for me related to that project (and, hopefully, to-read items for them):

eagereyes has a nice summary of ISOTYPE (International System of Typographic Picture Education) which in the roughest strokes is those charts where the number of an item is represented not by a bar but as a collection of images or icons representing the thing being counted. But it’s a lot more complicated than that, including guidelines about the design of the icons.
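To show just the “roughest strokes” version, here is a tiny text-only sketch of the repeated-icon idea. The data and the choice of one icon per 10 items are invented, and it ignores everything ISOTYPE says about how the icons themselves should be designed.

```python
def pictogram_chart(counts, unit=10, icon="*"):
    """Render each count as a row of repeated icons, one icon per `unit`."""
    width = max(len(label) for label in counts)
    rows = []
    for label, value in counts.items():
        icons = icon * round(value / unit)
        rows.append(f"{label.ljust(width)}  {icons}")
    return "\n".join(rows)

# Hypothetical counts, chosen as multiples of the unit to avoid rounding.
print(pictogram_chart({"cars": 40, "bikes": 30, "buses": 10}, unit=10))
```

The quantity is carried by how many icons appear in a row, not by making a single icon bigger.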

The same site also has a nice, short illustration of how visualizing something makes it real in a way that just seeing/reading the source information or data does not. To me, this highlights the importance of being thoughtful about what you are presenting and the accuracy of the analysis behind your visualization.

In a similar vein, I enjoy Junk Charts’ dissections of poor data presentation; this critique of bubble charts via a self-sufficiency test analysis is a nice example and serves as a good model for a simple way to assess your own visualizations.

If you’re thinking about data visualization to monitor something (which my students are), you probably need to think about whether a “dashboard” would be helpful. juice analytics has put together a collection of innovative dashboard designs, and also has a helpful link to a white paper they wrote on dashboard design at the top of the article.

We’re also probably working with some maps in our project, so this list of common problems in maps from cartonerd could be useful, as could the linked collection of UK maps, which is recommended as a tool for looking for ideas of what not to do. The key focus here is on persuasiveness of the maps; the overall message is ultimately narrowed down to the question “is the map as it stands capable of properly supporting policy-making?”

On the tool front, the announcement that students can now download and use the full desktop version of Tableau for free got a fair bit of fanfare recently.

My colleagues and I teach a number of courses where we hope students will go out and find their own data sets to work with – Quandl looks like it will be a great source to share with them. It’s a searchable collection of free and open datasets, normalized into a standardized format which can then be output in a range of useful formats (right now, Excel, CSV, JSON, XML or R). I love that you can browse the data online and even see some basic graphs of it without downloading, though. This is a site that some will use to look at data and get answers directly, not just a repository to download from.

Discourses

I’m helping organize a panel of faculty at my school who have been using a range of different technologies to support student interaction in and out of class. With so many options out there, we want to focus on what has worked for us and what hasn’t, and start some conversations around how to make the jump from looking at your course, with its content, outcomes, and pedagogy, to having some idea what options might be appropriate, drawing on others’ experience with these tools on the ground with our systems and our students.

Independently of this, I’ve got a group of students in a capstone using Basecamp to organize their project, and it’s got me thinking (not a very new thought, I’ll admit) about whether I like the idea of using this type of professionally-oriented project site in other courses to have students manage their groups. I get a lot less control than in a CMS, but the flip side is I’m running into fewer places where students are trying to make the site work for them and they don’t have enough power. SIGCSE just had a discussion on its mailing list about what repository systems people are using, at all levels of courses.

And when I see articles about new tools for collaboration and discussion like the new Discourse discussion platform, I’m immediately thinking about whether the improvements they talk about (less pagination, more dynamic processes for replying, flexible content embedding, and moderation/ranking tools) would work well for class discussion also. It reminds me of the window of time when I was in school and it was normal, if not expected, for a department to have a set of forums/groups associated with it, and not just on a course by course basis. Is that still out there and I’m simply at an outlier school without them, or has that type of conversation been killed off? Is it (*shudder*) on Facebook?

Sometimes you can blame the compiler. Sort of.

I don’t know if this weblog entry about bug hunting in large scale game development is more appropriate for my spring games course or my spring project management course. The stories are great for both directions. Team members with poorly defined roles! Frantic timelines leading to bugs! The reality of entire days lost to a bug that won’t be found, let alone fixed! Bugs explained in simple code a novice student can understand! I particularly enjoyed the explanation of why the live server compiler ran without debug capabilities, violating the ideal that the dev and live servers are identically configured, to ensure that debug features used to test the games by forcing benefits, monster spawn, etc. weren’t leaked into the live system – if there were never sensible reasons for breaking what seem like obvious rules, they wouldn’t get broken.

And there is an interesting lesson about customer satisfaction in the story of embedding code in a game to identify hardware failures and then not only detecting when bug reports are actually related to customer hardware failures but proactively telling customers when they are having hardware issues before problems crop up so they can do something about it. The explanation of why this ultimately saves them time by avoiding hard to resolve bugs that are really due to hardware faults makes sense, but also reflects an interesting decision about how much responsibility to take for the entire game playing experience, whether portions of that are actually one’s responsibility or not.

Risks in user content

My security class is talking about the types of commonly seen mistakes that can crop up when writing programs that lead to security flaws, and while I usually introduce the ideas using “normal” programming examples because it is the common background I can assume my students have, I’m trying to help the students map these ideas to what they’ve seen of database or web development as well. So I finally went back in my saved links and read through a Google blog post from a month ago about security issues in hosting user content, specifically web content.

After a brief but reasonably nice survey of the problem they’re trying to address, they include this interesting statement, contrasting the current state of affairs to the old days of hosting static HTML: “For a while, we focused on content sanitization as a possible workaround – but in many cases, we found it to be insufficient. For example, Aleksandr Dobkin managed to construct a purely alphanumeric Flash applet, and in our internal work the Google security team created images that can be forced to include a particular plaintext string in their body, after being scrubbed and recoded in a deterministic way.”
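The failures they describe are far subtler than anything I’d show in class, but the general flavor of “sanitization that looks fine and isn’t” can be sketched with a much simpler, classic example. This is my own illustration, not code or a technique from the Google post: the filter below is a deliberately naive blocklist, and the payload is a standard textbook one.

```python
import html

def naive_sanitize(text):
    """Blocklist approach: strip script tags in a single pass."""
    return text.replace("<script>", "").replace("</script>", "")

def escape_for_html(text):
    """Encode markup characters so a browser would treat input as text."""
    return html.escape(text)

# Removing the inner tags splices the outer fragments back into real tags.
payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"

print(naive_sanitize(payload))   # the filter's output is itself a script tag
print(escape_for_html(payload))  # no raw angle brackets survive
```

Escaping on output is shown only as a contrast to the blocklist; it does nothing for the cases in the quote, where the uploaded file is a perfectly valid image or applet that happens to carry something else inside it.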

I’ve been trying to make the argument to our digital media students (as I get opportunities to talk to them) that they really ought to think of security as a good elective to round out their major, particularly those focusing on courses in web development and mobile application development. I’m sorely tempted to print out a copy of this article and go over and paste it on their lab door – or at least remind their professor to do another advising push towards the course on its next offering. These problems are perhaps outside the scope of what most web developers would encounter, but I wonder if rejecting the importance of understanding these issues would be analogous to an application developer believing that only someone working in operating system design really has to understand security.

What are you getting credit for?

A colleague sent me an article about a U.S. university accepting transfer credit for a Udacity course – something described in the headline and the first few paragraphs as being a breakthrough in a school accepting a free, online course for full transfer credit.

The article gets interesting when you dig into it though. The course in question is an intro-level “Introduction to Computer Science” course. And, in order to get the transfer credit, students have to not only get a certificate of completion from Udacity showing that they completed the course, but also pass an exam administered at a testing center, for a cost of $89. Which happens to be the exact same price as taking the AP CS exam, which you can take, by the way, even if you aren’t signed up for an AP class (with, I think, some hoops to jump through). And, I’m going to bet that the transfer credit received is the same transfer credit students would get if they took the AP exam and did well on it.

So, the more accurate portrayal of the story is, I think, that there is a university that has decided that they will now allow students to get transfer credit either for taking an AP exam or for taking one of these tests run by Pearson VUE (which, when you look into it, is a massive testing operation already). And, the university accepting the transfer credit is applying it only for their fully online program, not their programs with physical campuses.

Moving into the realm of wild speculation now, even if this practice becomes widespread, it seems like it will turn into more of a threat for the AP/CollegeBoard than for colleges which already allow students to get transfer credit for some introductory courses. With there still being fees and the need to show up at a testing center, I can see this broadening access to college-level placement tests. But (again, wild guessing), it seems like colleges will have less to lose with some number of freshmen who would not otherwise come in with transfer credits now having a handful. Whereas, as a high school student highly focused on your GPA, you have an interesting choice: do you take an AP course which may get counted as more important on your transcript but may also result in a lower grade, or do you take an easier course, get a better grade, and then take an online course to supplement and then take a Udacity exam for college credit? Factor in that the AP is a one-shot deal, and it looks like you can retake the Udacity courses (more like the SATs) and there’s an interesting question of which is more appealing.

Robots run amok

Interesting story of the live webcast of the Hugo Awards being blocked by copyright enforcement bots. Short version: the live webcast included clips of the television episodes up for best script (as award ceremonies do) and UStream’s bots for detecting copyrighted work spotted it and blocked the entire rest of the broadcast. The article points out that not only is that fair use, but the clips were provided by the copyright holders, who were happy the content was being promoted as award winning.

The whole thing is reminiscent of NASA’s footage of the Curiosity landing being removed from NASA’s YouTube channel under the claim that it violated Scripps News Service’s copyright on the material. The problem being that Scripps uploaded NASA’s video to their own stream and, accidentally they say, marked it as being their own content. It ought to jump out at you that, whether Scripps made an honest mistake here or not, there’s plenty of potential for someone to fraudulently claim ownership of content and harass the legitimate owner or reap profits from the content with so little evidence required. Figuring out that a live feed of a NASA rover and the NASA control room during a highly publicized NASA mission actually does belong to NASA has to be one of the easier cases to get right…

The common thread being the automatic disabling or removal of content without solid evidence that infringement is happening or, clearly, human review. It also sounds like, from the Hugo Awards case, there isn’t anybody standing by on call to reverse these actions if errors are made. Add this to the list of things to worry about with both digital intellectual property management and what happens when you start moving to the cloud.