Determine If a Challenge is Worth Answering

05AB1E, 167 160 159 158 156 154 143 bytes

Damn, almost as long as a normal language...

Crap... longer currently beating the Ruby answer by 1 byte.

Now longer than the Ruby answer, argh!

I probably should go to bed right now.

Thanks to @wnnmaw for saving 1 byte and thanks to @R. Kap for saving another 2 bytes!

Code:

’£Ø ˆå§¾.‡¢ as g;#.¾¿„–(g.ˆåƒÛ('·Ç://ÐÏg.´¢/q/'+•œ_#()).‚Ø())’.e©’„à="Ž»"’DU¢®…ƒŠ‡¡`99£þs\®X¡¦vy’„à="‚¬-„¹"’¡¦'>¡¦¦¬l’±¸’¢s\}rUV)O2‹X5›Y3›)P

Or with more readability:

’£Ø ˆå§¾.‡¢ as g;#.¾¿„–(g.ˆåƒÛ('·Ç://ÐÏg.´¢/q/'+•œ_#()).‚Ø())’
 .e©
’„à="Ž»"’
 DU¢®
“ƒŠ‡“
 ¡`99£þs\®X¡¦
v
 y’„à="‚¬-„¹"’¡¦'>¡¦¦¬l’±¸’¢s\}rUV)O2‹X5›Y3›)P

Explanation:

First of all, a lot of text is being compressed here, which translates to good old Python. The uncompressed version is:

"import urllib.request as g
 f=g.urlopen('http://ppcg.lol/q/'+pop_#())
 #.append(f.read())"
.e©“class="answer"“¢®"useful and clear"¡`99£þs\®“class="answer"“¡¦vy“class="post-text"“¡¦'>¡¦¦¬l"python"¢s\}rUV)O2‹X5›Y3›)P

This part:

import urllib.request as g
stack.append(g.urlopen('http://ppcg.lol/q/'+pop_stack()).read())

actually pops a stack value, copies it into the URL and fetches all HTML data. The HTML data is pushed on top of the stack using #.append(f.read()).

We count the number of answers by counting the number of occurrences of class="answer".

To count the number of votes, we just split the data at "useful and clear" and keep only the digit values of [0:99] using ®"useful and clear"¡`99£þ. This is the number of upvotes.
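As a rough illustration of these two counting steps, here is a minimal Python sketch that runs on a hand-made HTML fragment instead of a live page. The function names and the sample markup are assumptions made up for this example; the real page layout may differ.

```python
def count_answers(html):
    # Mirrors the first step: count occurrences of class="answer".
    return html.count('class="answer"')


def upvote_digits(html, marker="useful and clear"):
    # Mirrors the second step: split at the marker, look at the first 99
    # characters of the last piece, and keep only the digits.
    tail = html.split(marker)[-1]
    return "".join(ch for ch in tail[:99] if ch.isdigit())


# Hypothetical fragment, loosely modelled on a question page.
sample = (
    '<a title="This question shows research effort; it is useful and clear">'
    'up vote</a><span class="vote-count-post ">14</span>'
    '<div class="answer">first</div><div class="answer">second</div>'
)

print(count_answers(sample))  # 2
print(upvote_digits(sample))  # 14
```

Like the value on the 05AB1E stack, the vote count comes back as a string of digits rather than a number.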

Finally, we need to check for every answer whether the text "python" appears before the closing header tag. To get all answers, we split the data on class="answer"; for each answer, we split on class="post-text" and split the result again on >. We remove the first two elements to get the part in which the language is displayed and check whether "python" occurs in the lowercase version of this string.
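The per-answer check can be sketched in plain Python roughly as follows. This is only an approximation of what the 05AB1E loop does: it splits on the ">" character (the one the code pushes with '>), and both the function name and the sample markup are assumptions for illustration.

```python
def python_flags(html):
    flags = []
    # Everything before the first class="answer" belongs to the question.
    for answer in html.split('class="answer"')[1:]:
        bodies = answer.split('class="post-text"')[1:]
        # Drop the first two pieces; the next one holds the answer's header.
        pieces = bodies[0].split(">")[2:] if bodies else []
        header = pieces[0].lower() if pieces else ""
        flags.append(1 if "python" in header else 0)
    return flags


# Hypothetical page: one question followed by two answers.
sample = (
    '<div class="post-text" itemprop="text">\n<p>The challenge text</p></div>'
    '<div class="answer"><div class="post-text" itemprop="text">\n'
    '<h1>Ruby, 10 bytes</h1></div></div>'
    '<div class="answer"><div class="post-text" itemprop="text">\n'
    '<h1>Python 3, 12 bytes</h1></div></div>'
)

print(python_flags(sample))  # [0, 1]
```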

So, now our stack looks like this for id = 79273:

`[6, '14', 0, 0, 0, 1, 0, 0]`
  │    │   └───────┬──────┘
  │    │           │
  │    │   is python answer?
  │    │
  │    └── number of upvotes
  │
  └─── number of answers

This can also be seen with the -debug flag on in the interpreter.

So, it's just a matter of processing the data:

rUV)O2‹X5›Y3›)P

r                # Reverse the stack
 U               # Pop the number of answers value and store into X
  V              # Pop the number of upvotes value and store into Y
   )O            # Wrap everything together and sum it all up
     2‹          # Check if smaller than 2
       X5›       # Push X and check if greater than 5
          Y3›    # Push Y and check if greater than 3
             )P  # Wrap everything into an array and take the product.
                   This results in 1 if and only if all values are 1 (and not 0).
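For readers who do not speak 05AB1E, the same final step can be written as a small Python sketch. The function name verdict is made up for this example; the input is the stack layout shown above (answer count, upvotes as a string, then one flag per answer).

```python
def verdict(stack):
    n_answers, upvotes, *python_flags = stack
    checks = [
        sum(python_flags) < 2,  # fewer than 2 Python answers
        n_answers > 5,          # more than 5 answers
        int(upvotes) > 3,       # more than 3 upvotes
    ]
    # Taking the product gives 1 only if every check is true.
    result = 1
    for check in checks:
        result *= int(check)
    return result


print(verdict([6, '14', 0, 0, 0, 1, 0, 0]))  # 1
```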

Uses CP-1252 encoding. You can download the interpreter here.


Python 3.5, 280 272 260 242 240 bytes:

(Thanks to Adnan for the trick about using the * operator in comparisons resulting in 2 saved bytes!)

def g(o):import urllib.request as u,re;R=re.findall;w=bytes.decode(u.urlopen('http://ppcg.lol/q/'+o).read());print((len(R('(?:<h[0-9]>|<p>).*python',w.lower()))<2)*(int(R('(?<="vote-count-post ">)[0-9]+',w)[0])>3)*w.count('answercell">')>5)

Simple enough. Uses Python's built-in urllib library to fetch the question's page, and then uses regular expressions and a substring count to find the vote count, the answer count, and the number of Python-specific answers in the decoded text returned from the website. Finally, these values are compared against the required conditions; if all of them are satisfied, True is printed. Otherwise, False is.
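An ungolfed reading of the function above might look like the sketch below. To keep it runnable without network access, it takes the HTML as a string instead of fetching it, and it uses `and` where the golfed version multiplies the comparisons; worth_answering and fake_page are hypothetical names, and the fake page is only a stand-in for the real markup.

```python
import re


def worth_answering(html):
    # Same three measurements as the golfed function.
    python_hits = re.findall(r'(?:<h[0-9]>|<p>).*python', html.lower())
    votes = int(re.findall(r'(?<="vote-count-post ">)[0-9]+', html)[0])
    answers = html.count('answercell">')
    return len(python_hits) < 2 and votes > 3 and answers > 5


def fake_page(votes, languages):
    # Hypothetical helper that builds a tiny stand-in for a question page.
    head = '<span class="vote-count-post ">%d</span>\n' % votes
    body = "".join(
        '<td class="answercell">\n<h1>%s, 10 bytes</h1>\n' % lang
        for lang in languages
    )
    return head + body


page = fake_page(14, ["Python 3", "Ruby", "05AB1E", "Jelly", "Pyth", "CJam"])
print(worth_answering(page))  # True
```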

The only thing I may be worried about here is that, to save bytes, the regular expressions leave a lot of leeway when counting Python-specific answers, so the result may be a bit inaccurate at times, though it's probably good enough for the purposes of this challenge. However, if you want a much more accurate one, I have added one below, although it's longer than the one above. It is currently 298 bytes, since for the sake of accuracy it uses a much longer regular expression for counting Python answers than my original function (you wouldn't believe how long it took me to come up with it). This one should work for at least 80% to 90% of all test cases thrown at it.

def g(o):import urllib.request as u,re;R=re.findall;w=bytes.decode(u.urlopen('http://ppcg.lol/q/'+o).read());print(len(R('(?<=answercell">).*?(?:<h[0-9]>|<strong>)[^\n]*python[^\n]*(?=</h[0-9]>|</strong>)',w.lower()))<2and int(R('(?<="vote-count-post ">)[0-9]+',w)[0])>3and w.count('answercell">')>5)

But what about questions with multiple pages of answers? Neither of the above will work very well in that situation if, say, one Python answer is on the first page and another is on the second. I took the liberty of fixing this issue by creating another version of my function (shown below) that checks every page of answers, if more than one exists, for Python answers, and it has done quite well on many of the test cases I have thrown at it. Without further ado, here is the new and updated function:

def g(o):
 import urllib.request as u,re;R=re.findall;w=bytes.decode(u.urlopen('http://ppcg.lol/q/'+o).read());t=0if len(re.findall('="go to page ([0-9]+)">',w))<1else max([int(i)for i in re.findall('="go to page ([0-9]+)">',w)])
 if t<1:print(len(R('(?<=answercell">).*?(?:<h[0-9]>|<strong>)[^\n]*python[^\n]*(?=</h[0-9]>|</strong>)',w.lower(),re.DOTALL))<2and int(R('(?<="vote-count-post ">)[0-9]+',w)[0])>3and w.count('answercell">')>5)
 else:
  P=[];U=[];K=[]
  for i in range(2,t+2):P.append(len(R('(?<=answercell">).*?(?:<h[0-9]>|<strong>)[^\n]*python[^\n]*(?=</h[0-9]>|</strong>)',w.lower(),re.DOTALL)));U.append(int(R('(?<="vote-count-post ">)[0-9]+',w)[0]));K.append(w.count('answercell">'));w=bytes.decode(u.urlopen('http://ppcg.lol/questions/'+o+'/?page='+str(i)).read())
  print(sum(P)<2and U[0]>3and sum(K)>5);print('# Python answers: ',sum(P));print('# Votes: ',U[0]);print('# Answers: ',sum(K))

Quite long, isn't it? I wasn't really going much for code golf with this, although, if you want, I can golf it down a bit more. Otherwise, I love it, and could not be happier. Oh, I almost forgot, as an added bonus, this also outputs the total number of Python answers on the question, total votes on the question, and total number of answers on the question if the question id corresponds to a question with more than 1 page of answers. Otherwise, if the question only consists of a single page of answers, it just outputs the truthy/falsy value. I really did get a bit carried away with this challenge.
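The page-discovery part of the multi-page function can be looked at in isolation. The sketch below is a simplified take, not the golfed code itself: it reuses the same "go to page" pattern and the same URL shape, but last_page and extra_page_urls are hypothetical names, and it lists only pages 2 up to the last page found.

```python
import re


def last_page(html):
    # Collect every page number mentioned in a "go to page N" link.
    pages = [int(n) for n in re.findall(r'="go to page ([0-9]+)">', html)]
    return max(pages, default=0)  # 0 means only one page of answers


def extra_page_urls(question_id, html):
    # URLs of the additional answer pages that would need to be fetched.
    return [
        'http://ppcg.lol/questions/%s/?page=%d' % (question_id, i)
        for i in range(2, last_page(html) + 1)
    ]


# Hypothetical pager markup for a question with three pages of answers.
sample = '<a title="go to page 2">2</a><a title="go to page 3">3</a>'

print(last_page(sample))  # 3
print(extra_page_urls("79273", sample))
```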

These each take the question's id in the form of a string.

I would put Try It Online! links here for each function, but unfortunately, neither repl.it nor Ideone allows fetching of resources via Python's urllib library.


Ruby + HTTParty, 170 146 145 142 139 138 + 11 ( -rhttparty flag) = 181 157 156 153 150 149 bytes

I don't think there are any edge cases that would cause my regex patterns to break, I hope...

Updated to use the shortlink provided by @WashingtonGuedes, and discovered that HTTParty doesn't complain if I start the URL with // instead of http://.

Updated for slightly more secure regexes. I saved bytes anyways by discovering that HTTParty response objects inherit from String, which means I don't even need to use .body when matching the regex!

@manatwork pointed out an accidental character addition that I had left in, and for the sake of golf, i has to be accepted as a String now.

Updated regexes. Same length. -1 byte by cutting out a paren.

->i{/"up.*?(\d+)/=~s=HTTParty.get("//ppcg.lol/q/"+i)
$1.to_i>3&&(a=s.scan /st.*xt".*\n(.*)/).size>5&&a[1..-1].count{|e|e[0]=~/python/i}<2}

Extra notes:

  • The first line of an answer (which should contain the language according to the spec) is two lines after the HTML tag with class "post-text", which we matched with st.*xt". A more secure version would have added a space after it, but we're sacrificing that for the sake of golf.
  • HTTParty is used over the native net/http modules because of proper redirect handling for the given URL.
  • "up.*?\d was the shortest sequence I found that corresponded with the number of votes. We only need the first one, so thankfully answers don't affect this.
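For comparison with the other answers, here is a rough Python rendering of the same regex logic. It is a sketch under assumptions, not a translation that has been checked against the live site: it takes the HTML as a string instead of fetching it with HTTParty, the name ruby_style_check is made up, and the sample page is invented to fit the patterns.

```python
import re


def ruby_style_check(html):
    # First number after the first "up, as in the Ruby vote regex.
    votes = int(re.search(r'"up.*?(\d+)', html).group(1))
    # The line following each post-text tag; the first hit is the question.
    first_lines = re.findall(r'st.*xt".*\n(.*)', html)
    python_answers = sum(
        1 for line in first_lines[1:] if re.search(r'python', line, re.I)
    )
    return votes > 3 and len(first_lines) > 5 and python_answers < 2


# Hypothetical page: a vote line, the question body, then six answers.
languages = ["Python 3", "Ruby", "05AB1E", "Jelly", "Pyth", "CJam"]
blocks = ["<p>Some challenge</p>"] + [
    "<h1>%s, 10 bytes</h1>" % lang for lang in languages
]
page = '<a title="up vote"></a><span>14</span>\n' + "".join(
    '<div class="post-text" itemprop="text">\n%s\n' % block for block in blocks
)

print(ruby_style_check(page))  # True
```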