fuzzy notepad

[blog] A new use for StackOverflow

It’s hard to get a feel for a new tool. Is it any good? Does it do anything I can’t already do? What’s the community like? Tough questions to answer without diving in and using it for a significant amount of time—and then you risk not liking the answers you get.

But fear not! I have discovered a new and brilliant way to discern the novel features of a tool, the vibrance of its community, and its range of users all at once. In mere minutes.

Look at its ten highest-voted questions on StackOverflow.

I’m totally serious. Watch.

Python

The list.

The first three ask about how to use generators, metaclasses, and decorators—probably Python’s three neatest metaprogrammingish features.

Number 4 asks about running Python on Android, a common question that hints at Python’s popularity as a dynamic Java alternative.

Number 5 is about the equivalent of enum, which is a pretty common question (and garnered 35 answers, wow) about how to structure your program.

6, 7, and 8 are about checking for a file’s existence, becoming an expert in Python, and running an external command. Seems there are people who jumped to Python from shell scripting, and want to know how to use it more seriously.

9 is about the ternary operator, which was new at the time (and which is unusual enough that most newcomers don’t know it’s there).
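For anyone who hasn't met it: Python's conditional expression (added in Python 2.5) puts the condition in the middle rather than at the front, which is a big part of why newcomers overlook it. A tiny sketch, with `parity` as a made-up function name just for illustration:

```python
def parity(n):
    # Shape: <value if true> if <condition> else <value if false>
    return "even" if n % 2 == 0 else "odd"
```

The C-style equivalent would read `n % 2 == 0 ? "even" : "odd"`, with the condition first.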

10 is, um, Peak detection in a 2D array. Clearly some people are doing some cool number crunching and visualization with Python.

So what can we take from this?

  • New Python developers are interested in becoming proficient;
  • Python has some novel features that developers are interested in understanding;
  • Python appeals to sysadmins, app developers, and people doing scientific computing.

Sounds pretty accurate to me. Let’s try something else.

PHP

The list.

Question 1 is about preventing SQL injection. Appropriately, question 10 is about which of the solutions to use.

Number 2 is about whether to use DATETIME or TIMESTAMP in MySQL. No, don’t worry, you didn’t miss anything; this actually has nothing to do with PHP whatsoever.

3 is a massive syntax reference. I’ve actually never seen a meta-question like this on SO before.

4 asks how to parse HTML. 7 asks about long polling, though the ultimate answer is more about JavaScript and Apache.

5, 8, and 9 are about how to store passwords, how to use bcrypt for passwords, and how to hash passwords.

These are substantively different types of questions.

  • PHP is used overwhelmingly for Web development, and commonly with MySQL.
  • PHP developers are confused by its syntax, and the documentation isn’t sufficiently helpful.
  • Four of these questions are about security issues. You might take this to mean that PHP developers are security-conscious… or you might take it to mean that a lot of PHP code has security issues and nobody knows how to fix them. The interpretation is up to you, but do note that most StackOverflow questions are asked reactively.

It’s kind of hard to see what problems PHP is commonly used to solve; the only question about solving a particular problem in PHP asks how to parse HTML, and the answers are just “use one of these ten libraries”.

But PHP is aimed at the Web, so naturally it would be tied to a bunch of Web questions. I wonder what people ask about my pet Web framework?

Pyramid

The list. Note that these questions have far fewer upvotes than the top questions for PHP or Python, which makes them less likely to be statistically significant.

The first two ask about Pyramid vs Pylons and whether Pyramid is production-ready.

3 asks about output formats. 4 asks about user auth. 5 asks about form libraries. 10 asks about templating engines.

6 is a sort of code review request for a file upload implementation. The asker also asks if there are any unobvious vulnerabilities, and indeed the lone answer points one out.

7 asks about gzip compression, which doesn’t really have anything to do with Pyramid, but the top answer finds a solution anyway. 9 asks a strange, sparsely-detailed question about SQLAlchemy that again has nothing to do with Pyramid.

8 asks how to debug Pylons apps with Eclipse. Neat.

These don’t really look like the PHP questions, either.

  • Early adopters wanted to know whether Pyramid is stable yet. I expect this would happen with most technologies newer than StackOverflow: the oldest questions, which were also the most relevant ones at the time, will be about what it can do and whether to use it.
  • Pyramid users are interested in its built-in web development tools (templating, etc.) and how to use them.
  • Along the same lines, Pyramid users want to use their fancy-pants debugging IDE with it.
  • At least this one guy is interested in security issues he has not yet predicted. This is very different from asking about how to prevent a vulnerability you know only by name.
  • Apparently, web developers in general can’t tell where their framework ends and other pieces begin.

This is fascinating, but time-consuming, so I’ll only do one more. I’m curious to see…

Rust

The list. Again, these questions have very few upvotes, since Rust is a new and unfinished thing. Let’s look anyway.

1 asks how Erlang compares to Rust. 3 asks if anyone has used Rust at all, and 4 wants some examples of Rust projects.

2 asks about typestate. 6 is confused about what “monomorphization” is, in either Rust or C++.

5 is about ranges, 7 is about accessing enum fields, and 9 features abuses of pattern matching. 10 wants to know how to use sockets.

8 reveals a weird cargo error.

So.

  • Rust is new. Surprise!
  • Rust is getting people interested in type system theory, which is cool. The typestate answer explains the concept in fantastic detail, as well as hinting at why the feature was effectively removed from Rust several releases ago.
  • Rust users are not clear on how to use some of its features. This isn’t surprising, since Rust deliberately bucks some trends, but it does point to some potential deficiencies in the tutorial.

End

Okay, maybe this isn’t scientifically rigorous. Upvotes don’t have a precise meaning, and top questions will tend to stay at the top, and older questions have a bias, and genuine problems with a tool may have been fixed since the question was asked, and so forth.

But since upvotes are all about people, the top questions can tell you what other people think a technology is about, what they’re doing with it, and what problems they’re experiencing. Maybe give it a shot next time you’re thinking about trying out a new language, or deciding between two libraries.
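If you'd rather script it than click around, here's a rough sketch of pulling the top ten for a tag from the public Stack Exchange API. The function names are my own invention, and I'm assuming the `2.3` questions endpoint with its `order`, `sort`, `tagged`, `site`, and `pagesize` parameters; unauthenticated requests are rate-limited, so treat this as a toy rather than a tool.

```python
import gzip
import html
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Stack Exchange API's question listing.
API_ROOT = "https://api.stackexchange.com/2.3/questions"


def top_questions_url(tag, count=10):
    """Build the URL for a tag's highest-voted questions."""
    params = {
        "order": "desc",
        "sort": "votes",
        "tagged": tag,
        "site": "stackoverflow",
        "pagesize": count,
    }
    return API_ROOT + "?" + urlencode(params)


def summarize(payload):
    """Turn a decoded API response into (score, title) pairs."""
    # Titles come back HTML-escaped, so unescape them for reading.
    return [
        (question["score"], html.unescape(question["title"]))
        for question in payload.get("items", [])
    ]


def fetch_top_questions(tag, count=10):
    """Fetch and summarize the top questions (needs network access)."""
    with urlopen(top_questions_url(tag, count)) as response:
        raw = response.read()
    # The API compresses its responses; gunzip if we see the gzip magic bytes.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return summarize(json.loads(raw))
```

Calling `fetch_top_questions("rust")` should hand back a list you can skim just like the ones above. The interpretation is still up to you.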

Remember, there are no stupid questions! Only stupid software.

