Computing desk

Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.

May 8

How do you write a Python function to read a file and only print the words in it with 20+ characters?

How do you write a Python function to read a file and only print the words in it with 20+ characters? Futurist110 (talk) 01:47, 8 May 2019 (UTC)

Review the official Python reference for standard data types (including text strings).

If you're totally lost, read the tutorial.

Nimur (talk) 02:01, 8 May 2019 (UTC)

1) Write a Python program that reads a file and prints the words from it, one per line (that means figuring out how to split the file into words). 2) Write a function that, given a word, figures out whether it has 20+ characters. 3) Modify the program from 1) to print out only those words. For 1), see the .split() operation on strings. 67.164.113.165 (talk) 19:19, 8 May 2019 (UTC)
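
A minimal sketch of the three steps above, assuming the file is UTF-8 text and that words are separated by whitespace. The names `is_long_word`, `long_words` and `print_long_words`, and the `min_length` parameter, are illustrative only and do not come from any post here.

```python
def is_long_word(word, min_length=20):
    """Step 2: report whether a word has at least min_length characters."""
    return len(word) >= min_length


def long_words(path, min_length=20):
    """Steps 1 and 3: split the file into words and keep only the long ones."""
    found = []
    # Assumption: the file is UTF-8 text.
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # split() with no arguments splits on any run of whitespace.
            for word in line.split():
                if is_long_word(word, min_length):
                    found.append(word)
    return found


def print_long_words(path, min_length=20):
    """Print each qualifying word on its own line."""
    for word in long_words(path, min_length):
        print(word)
```

With this approach any punctuation attached to a word counts toward its length, so a 19-letter word followed by a comma would be reported.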

Questions that exercise programming skill are welcomed at https://codegolf.stackexchange.com/. Here is a suggestion.

Decide on a text file format. A simple choice is a .txt file created by a Windows program such as the Notepad editor.
Observe the character coding, particularly which codes are alphanumeric characters. Supposing plain ASCII text (which is also valid UTF-8) and not attempting to handle non-ASCII Unicode characters, we shall treat hexadecimal ranges 0x30..0x39, 0x41..0x5A and 0x61..0x7A as alphanumeric characters and every other byte value as a non-word character.
Use Notepad to save any TEST.TXT that contains words of various lengths.
Construct a Python program to implement this Pseudocode:

	Declare a 20-character string $[0..19]
	Declare unsignedinteger counter n
	Declare unsignedinteger index i
	Declare unsignedinteger continueflag kf
	Declare byte b
	Open file TEST.TXT for reading in binary
	Open file OUT.TXT for writing text
	n=0
	kf=0
	while not at end of TEST.TXT
		b = read byte from TEST.TXT
		REM a word separator is any non-alphanumeric byte
		if ( b < 0x30 ) or ( (b > 0x39) and (b < 0x41) ) or ( (b > 0x5A) and (b < 0x61) ) or ( b > 0x7A ) then
			REM end the output line if a long word was being printed, then reset
			if kf=1 then output to OUT.TXT 0x0D,0x0A
			kf=0
			n=0
		else if kf=1 then
			REM still inside a word of 20+ characters
			output to OUT.TXT b
		else
			$[n] = b
			n=n+1
			if n=20 then
				REM word of 20+ characters found
				for i=0..19
					output to OUT.TXT $[i]
				kf=1
		endif
	wend
	REM finish the last line if the file ended inside a long word
	if kf=1 then output to OUT.TXT 0x0D,0x0A
	Close files
	Display "Done."
	End

DroneB (talk) 20:48, 8 May 2019 (UTC)

Ugh. DroneB, sorry, but the pseudocode above clearly shows you are not familiar with Python. There is no variable declaration (not even type, let alone string length) in Python (duck typing). Reading text files byte-by-byte as binary is not a standard pattern either, and finding "words" by checking the byte values is just awful (I suspect that one is not specific to Python). (The IP above gave the correct advice to look at the .split() method of strings, though one might want to go full regular expressions to handle some edge cases.)

Also, codegolf.stackexchange is certainly not a place to ask semi-homework questions such as the one above. (One should not practice code golf before knowing at least a long solution to the problem!) Tigraan^{Click here to contact me} 15:59, 9 May 2019 (UTC)
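
As a rough illustration of the regular-expression route mentioned above, here is a sketch that pulls words out of a string while ignoring surrounding punctuation. The definition of a "word" used here (letters, optionally joined by apostrophes or hyphens) is an assumption, and `WORD_PATTERN` and `find_long_words` are made-up names.

```python
import re

# Assumption: a "word" is a run of letters, optionally joined by
# apostrophes or hyphens; digits and underscores are excluded.
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def find_long_words(text, min_length=20):
    """Return the words in text that have at least min_length characters."""
    return [w for w in WORD_PATTERN.findall(text) if len(w) >= min_length]
```

Unlike a plain .split(), this does not count a trailing comma or quotation mark as part of the word, though whether hyphenated compounds should count as one word is a choice the OP would have to make.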

We are assuming the OP doesn't mean a function with 20+ characters and does mean words with 20+ characters. Such words are rare in English where a fluent 3000-word vocabulary may contain no word longer than 13 characters ("grandchildren"). We do not know whether the OP is searching for such rarities ("Antidisestablishmentarianism") or for constructed passwords and/or in a non-English language.

Here is an on-line word statistics site.

Python offers the built-in string method

str.split([sep[, maxsplit]]) Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done.

and, in Python 2 only, the equivalent function of the string module (removed in Python 3)

string.split(s[, sep[, maxsplit]]) Return a list of the words of the string s. If the optional second argument sep is absent, the words are separated by arbitrary strings of whitespace characters (space, tab, newline, return, formfeed).
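
The behaviour described in the quoted documentation can be checked with a few calls to the string method; the sample strings below are made up for illustration.

```python
text = "alpha  beta\tgamma\ndelta"

# No separator: any run of whitespace delimits words, and empty strings are dropped.
words = text.split()

# Explicit separator: consecutive separators produce empty strings.
fields = "a,,b".split(",")

# maxsplit limits the number of splits; the remainder stays in the last item.
limited = "one two three four".split(" ", 2)

print(words)    # ['alpha', 'beta', 'gamma', 'delta']
print(fields)   # ['a', '', 'b']
print(limited)  # ['one', 'two', 'three four']
```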

We shall not provide the OP with a Python script tested and debugged for the OP's requirement. Hopefully the references given may have already resolved the question. If not, the OP needs to look into what the program does at the byte and character level. My pseudocode is a deliberately language-agnostic suggestion of what to look for, i.e. an environment-independent description of the key principles of an algorithm; it is by no means a how-to for writing Python, nor is it an executable or compilable script. DroneB (talk) 15:00, 11 May 2019 (UTC)

Can't login into TurboTax/Intuit

I can't log into my TurboTax account. I can't tell if it's me or a problem with their website. Can someone who has a TurboTax account try to log in and see if it's working for them? A Quest For Knowledge (talk) 18:24, 8 May 2019 (UTC)

Looking at my console window, I'm getting several CORS errors:

Access to XMLHttpRequest at 'https://prod-services.myturbotax.intuit.com/services/clientLog' from origin 'https://myturbotax.intuit.com' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: The value of the 'Access-Control-Allow-Origin' header in the response must not be the wildcard '*' when the request's credentials mode is 'include'. The credentials mode of requests initiated by the XMLHttpRequest is controlled by the withCredentials attribute.

I'm not sure if that's something to do with my work computer. I'll try to log in when I get home. A Quest For Knowledge (talk) 18:28, 8 May 2019 (UTC)

Hmmm...I can log in using Firefox but not Chrome. At this point, I'm thinking there might be something wrong with my computer. A Quest For Knowledge (talk) 18:33, 8 May 2019 (UTC)

It's not your computer, per se; it's more of a security issue. See: Cross-origin resource sharing (CORS). Basically, the browser checks whether a page served from one origin (here https://myturbotax.intuit.com) is allowed to make requests to another (https://prod-services.myturbotax.intuit.com). According to the error message, the response to the preflight request carried the wildcard '*' in its Access-Control-Allow-Origin header, which browsers refuse to accept when the request is sent with credentials such as cookies; the response has to name the requesting origin explicitly. For the gory details, try reading XMLHttpRequest (yeah, right). 2606:A000:1126:28D:30C6:6408:4CE8:2DAC (talk) 05:46, 9 May 2019 (UTC)
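
The rule quoted in the error message can be sketched as a small function. This is a simplified illustration only, not how any browser actually implements it; `cors_response_allowed` and its parameters are hypothetical names, and real preflight handling also checks the allowed methods and headers.

```python
def cors_response_allowed(request_origin, allow_origin, allow_credentials, with_credentials):
    """Simplified browser-side check of a CORS response (hypothetical helper).

    allow_origin and allow_credentials stand for the values of the
    Access-Control-Allow-Origin and Access-Control-Allow-Credentials headers.
    """
    if with_credentials:
        # The wildcard is never acceptable for credentialed requests.
        if allow_origin == "*":
            return False
        # The server must name the origin and opt in to credentials explicitly.
        return allow_origin == request_origin and allow_credentials == "true"
    return allow_origin == "*" or allow_origin == request_origin


origin = "https://myturbotax.intuit.com"
print(cors_response_allowed(origin, "*", "true", with_credentials=True))     # False
print(cors_response_allowed(origin, origin, "true", with_credentials=True))  # True
```

The first call mirrors the situation in the console output: a wildcard header combined with the credentials mode 'include' is rejected.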

I was able to log into TurboTax on Chrome on my home computer. I think I have some sort of misconfiguration or weird setting on my work laptop's version of Chrome. In any case, I was able to log in. A Quest For Knowledge (talk) 13:21, 9 May 2019 (UTC)

From the first page on https://www.intuit.com, check the certificate (click on the lock, view the certificate). Is it signed by INTUIT INC? When I go to it from my work computer, it is signed by GHSAUTH. That is because the company computers send all https requests to IT where it is set up to do a "man in the middle" attack and decrypt all of the secure traffic. That causes problems with wildcard certificates because the IT service has to properly supply wildcard certificates when it fakes the certificate. 12.207.168.3 (talk) 14:45, 9 May 2019 (UTC)

Wikipedia:Reference desk/Archives/Computing/2019 May 8
