Kevin and I just posted to the blog about automatically flagging comments on Stack Overflow with The Unfriendly Robot. We also talked about the robot recently here.

After you give it a read, we're interested in your thoughts and feedback, and we'd like to answer any questions you might have. Thanks! 🙏

EDIT

Some questions are difficult to answer in the comments so I'm going to reproduce them here and answer them.

From jcsahnwaldt says GoFundMonica

From 2009 until mid-2018, comments could be flagged as "offensive". Happened for ~0.1% of all comments. Now we can flag comments as "unfriendly or unkind". When was the new flag introduced? (I guess in mid-2018?) How many comments are flagged as "unfriendly or unkind"? In your study with moderators and other users, the median person classified ~3.5% as "unfriendly". What was the average "unfriendly" flag percentage in that study? Did you try to measure how much of the difference between ~0.1% and ~3.5% is due to "unfriendly" being a broader criterion than "offensive"?

I agree with you that the definitions of offensive and unfriendly are different and that (by truthiness of gut) the frequency of unfriendliness should be higher than the frequency of offensiveness. We introduced the "unfriendly or unkind" flag in August 2018 in conjunction with the Code of Conduct change, and we kept the "offensive" flag around. What happened to the percentage of posts flagged before and after the change?

This is a monthly box plot of the percentage of posts flagged each day, by flag type. In red is the offensive flag, which corresponds to the box plot in the blog post. In blue is the unfriendly-or-unkind flag. When the unfriendly flag was released, offensive flag usage dropped way, way down. There was a copy change here too, so "offensive" isn't what users see anymore; they see this...

Regardless, I think the graph shows the right thing: what's truly offensive is way lower than what we used to think (if just going by the name of the flag). Unfriendly-or-unkind flag usage jumps up for a few months, and then settles back down to where the old offensive flag was. This supports the hypothesis that the Stack Overflow system has an underflagging problem. Even when we expanded the definition of what should be flagged, overall throughput wasn't meaningfully different. I don't think this result gives us any ability to answer "How much of the difference is from the broader definition?", though.
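For anyone who wants to build a similar view from their own data, here is a minimal Python sketch of the aggregation behind a monthly box plot of daily flag percentages. It is not the code used for the graph above. The function name `monthly_box_stats` and the row layout (date string, flag type, flagged count, total comments) are assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import quantiles


def monthly_box_stats(daily_rows):
    """Summarize daily flag percentages per (month, flag type).

    daily_rows is a hypothetical iterable of tuples:
    (date as 'YYYY-MM-DD', flag_type, flagged_count, total_comments).
    """
    groups = defaultdict(list)
    for day, flag_type, flagged, total in daily_rows:
        if total == 0:
            continue  # skip days with no comments to avoid dividing by zero
        month = day[:7]  # 'YYYY-MM'
        groups[(month, flag_type)].append(100.0 * flagged / total)

    stats = {}
    for key, pcts in groups.items():
        if len(pcts) < 2:
            continue  # statistics.quantiles needs at least two data points
        q1, median, q3 = quantiles(pcts, n=4, method="inclusive")
        stats[key] = {
            "min": min(pcts),
            "q1": q1,
            "median": median,
            "q3": q3,
            "max": max(pcts),
        }
    return stats
```

Each entry in the result holds the five numbers a box plot draws for one month and one flag type, so the red and blue series can be compared month by month.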

From jcsahnwaldt says GoFundMonica
