-------------------------------------------------------------------------------
Generating Random Passwords...

Methods...
  * Hexadecimal passwords (including UUIDs)
  * Base64 passwords
  * Easier to Type - removing confusing characters (typing or printed)
  * Pronounceable Passwords
  * Random list of words (XKCD & 'Hay Stack' Principles)
  * Random phrase generation (discussion only at this point)

NOTE: use of Unicode or emoji is not (yet) recommended for passwords.
These contain multi-byte (non-ASCII) characters that not all systems can
handle correctly. They are not going to work everywhere and may even break
systems. They are also not as easy to remember, or to record in a password
manager.

-------------------------------------------------------------------------------
Hexadecimal...

uuidgen
  UUIDs are sequences of hexadecimal digits and hyphens.
  A 128 bit value, or 3.4e38 possibilities, with hyphens added.
  By default "uuidgen" generates a random (version 4) UUID. Other versions
  are based on the time, or on a hash of a name within a name space.

  HOWEVER: 2 digits are not so random!

    uuidgen      (36 chars, with 4 hyphens)
    # Example Output:  ff7739d0-b58f-404d-92c8-d9bf9be8ef22
    # NOTE:                          ^    ^
    #                     version ---'    '--- variant
    # EG: version 4 = the UUID generation method (4 is random)
    #     variant is tricky, and based on the number of leading 1 bits
    #       binary   hex     variant
    #       ------   -----   -------
    #       0xxx     0 - 7   reserved (NCS backward compatible)
    #       10xx     8 - b   DCE 1.1, ISO/IEC 11578:1996
    #       110x     c - d   reserved (Microsoft GUID)
    #       1110     e       reserved (future use)
    #       1111     f       invalid
    # So the above is the DCE 1.1 variant

  Remove hyphens (32 hexadecimal chars)

    uuidgen | tr -d -
    # Example Output:  8428f2f9a81c4bd4950e847a6add1d99

md5sum, sha1sum, sha256sum,...
  These are hashing functions of some type of input data.
  Deterministic (repeatable) with the same data input.
    EG: user's name, current time, random value
  NOTE: adding or not adding spaces/newlines will also affect the outcome.

  Adding a date or other source of randomness to the input will randomise
  the data much more.
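  As a minimal sketch of that last point, assuming GNU coreutils and a
  readable /dev/urandom, random bytes can be mixed into the hashed input so
  that the result is no longer repeatable. The function name
  "random_hex_hash" is made up for this illustration.

```shell
# Hypothetical helper: hash the user name, the time, and 16 random bytes.
random_hex_hash() {
  { printf '%s\n' "${USER:-nobody}"   # fixed data (deterministic on its own)
    date +%s                          # current time in seconds
    head -c16 /dev/urandom            # 16 random bytes, the unpredictable part
  } | sha256sum | cut -d' ' -f1       # keep only the 64 hexadecimal digits
}

random_hex_hash
```

  Without the random bytes the output would repeat for the same user within
  the same second, which is why the time alone is a weak source.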
  Output Length (in hexadecimal characters):
      32   md5sum
      40   sha1sum  shasum
      56   sha224sum
      64   sha256sum
      96   sha384sum
     128   sha512sum  b2sum

md5sum -- 128 bits (like uuid)

    # Generate 32 character hexadecimal (128 bits)
    echo "$USER" | md5sum | cut -f1 -d' '
    # => f11ae3e79cc88dbe97b2e1da6ed97a57

    # Convert into uuid format
    echo "$USER" | md5sum | sed 's/\([0-9a-f]\{4\}\)/\1 /g;' |
      awk '{print $1$2"-"$3"-"$4"-"$5"-"$6$7$8}'
    # => e4ea5477-492b-160a-8d7a-eac1fb16d107

Purely random
  openssl is probably the best way
  You specify the number of binary bytes, then the output conversion.
  Each byte becomes 2 hexadecimal digits, so the argument is 1/2 the number
  of digits wanted.

    openssl rand -hex 16
    # Example Output:  1abc8210752f3f8f674c40dbb0f72134

  Using /dev/urandom in various ways
  Note: you may need to add a newline on the end

    # urandom - extract just the characters wanted
    tr -dc '0-9a-f' </dev/urandom | head -c32; echo

    # urandom - convert random bytes to hexadecimal
    head -c16 /dev/urandom |
      perl -0777 -e 'print unpack("H*",<>), "\n"'

    # OR using the very old "dd" command to get characters
    dd if=/dev/urandom bs=1 count=16 2>/dev/null |
      perl -0777 -e 'print unpack("H*",<>), "\n"'

  Select from an array
  Using "shuf" options:  -z null delimiter, -e args are input,
  -r repeat picks

    shuf -zer -n32 {0..9} {a..f}; echo
    # Example Output:  d53c1c3a55d239d2b76318348aa5d632

-------------------------------------------------------------------------------
Base64...

openssl encrypt
  We must input enough data to generate the required number of characters.
  Output includes: numbers, uppercase, lowercase, '+' and '/'.
  As well as one or two '=' at the end of the encrypted data (depending on
  length). It can also contain newlines that will need to be removed.

  The first 8 bytes of the raw encrypted data are the magic string
  "Salted__". So we need to ignore the first 9 binary bytes -> 12 base64
  characters of the output.
  It is not deterministic due to salting

    # Using encrypted data - skip first 12 base64 'magic' characters
    # first echo is the password, second is enough data to encrypt
    { echo $RANDOM; hostname --fqdn; } |
      openssl enc -aes-256-cbc -pass stdin -base64 |
        tr -d '\012' | cut -c13-36
    # Example Output:  ZJ89Yb8+Nf2O5VYDf0PRZe0H

  Deterministic for a specific machine (no salt, no magic)

    # Using encrypted data in a deterministic way (not random)
    # First echo is the password, rest is enough data to encrypt
    { echo "$USER"; hostname --fqdn; hostname --fqdn; } |
      openssl enc -aes-256-cbc -nosalt -pass stdin -base64 |
        tr -d '\012' | cut -c1-24
    # Example Output:  pZYK2X+STT5/zvFTQK4NzAyC

Randomly Generated

    # generate 18 binary bytes -> 24 base64 chars (3->4)
    openssl rand -base64 18 | cut -c1-24

    # urandom converted (default width of "base64" is 76 chars)
    head -c18 /dev/urandom | base64

    # shuf array of chars (the base64 alphabet)
    shuf -zer -n24 {0..9} {a..z} {A..Z} + / ; echo
    # Example Output:  9RN884MJAfkXBBWWWf3cCGGd

  WARNING: Base64 normally expands groups of 3 binary bytes to 4 base64
  chars. This means if the final group is short, 1 or 2 extra '=' are added
  to fill out the group and mark that a short group of binary bytes was
  given.

  Examples showing results of short binary input

    head -c15 /dev/urandom | base64
    # Example Output:  R/wVas+AgAoHX546RYqd
    head -c16 /dev/urandom | base64
    # Example Output:  8Jf9earondovewHVMyLzAQ==
    head -c17 /dev/urandom | base64
    # Example Output:  E0poUTMjb+Cm/Hfkce+JBNs=
    head -c18 /dev/urandom | base64
    # Example Output:  tDlBw7Xk19SbYpz6i+tkmQM9

-------------------------------------------------------------------------------
Cleaned Passwords

That is, removing characters that may cause confusion when reading.

Many characters can look very similar to each other.
For example:  '1', 'l', 'I'  and  '0', 'O'  and  '8', 'B'
To prevent such confusion these characters should be avoided, or replaced.

Spaces are very easily confused with tabs, and newlines and returns vary
between platforms and often end the password input, so whitespace cannot
be used in a password.

The same goes for the extended Latin characters, such as  ò Â ë ý ± ·

Some punctuation characters may, or may not, be a good idea. For example
a '.' sometimes comes out so small on a printed page that it looks more
like a spot of dirt than a period, and it can be confused with a comma ','.

Unicode has identical looking characters, for example dashes, with
different meanings based on usage...  '−', '–', '—'. Similarly Unicode
can be impossible to type. As such Unicode is best avoided entirely.

Some characters are hard to type on tiny mobile phone keyboards. Even
typing a ',' on an Apple iPhone keyboard can be a pain, and typing a period
'.' can automatically add a space after it!

One test of readability is to see if OCR (Optical Character Recognition)
gets the characters right. For example: an 'f' in a serif font is often
mistaken for a 'P' by OCR software.

So here are some basic techniques to 'clean' a password for reading/printing

Base64 - Replace confusing characters with punctuation

    # openssl rand 18 binary -> 24 base64 chars
    openssl rand -base64 18 | tr '1lI0O8B' '-\\$*%=_@#^&'
    # Example Output:  a/p5%b4fKwLi$Es\k9*CXJry

Base64 - Replace confusing/hard to type characters with more normal ones

    openssl rand -base64 18 | cut -c1-24 |
      tr '0OQ1lI8B./+^#;' 'XYZabc67rst234'
    # Example Output:  Y6f57TxbY29AEPsyNJSRgjG7

Base64 - Just remove ALL confusing characters and punctuation

    # openssl rand, generate lots, as some characters will be removed
    openssl rand -base64 36 | tr -d '\0121lI0O8B+/=' | cut -c1-24
    # Example Output:  fAkK5Foym5hTGMfvZPimMPE6

Random

    # Read a restricted list of random characters until you get enough.
    # The LC_ALL is to fix "tr A-Z" handling in some international locales.
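  One way this could look, as a sketch only. The character set used here
  (letters and digits, minus the confusing '1lI0O8B' listed earlier), the
  default length of 24, and the function name "clean_password" are all
  assumptions made for illustration.

```shell
# Hypothetical helper: N random characters from a restricted set.
# LC_ALL=C makes "tr" treat its input as plain bytes with ASCII ranges.
clean_password() {
  # Delete everything that is NOT in the wanted set, then keep N characters.
  LC_ALL=C tr -dc 'AC-HJ-NP-Za-km-z2-79' </dev/urandom | head -c "${1:-24}"
  echo    # add the trailing newline
}

clean_password 24
```

  Reading straight from /dev/urandom means "tr" simply keeps going until
  "head" has collected enough characters, so there is no need to guess how
  many random bytes to request up front.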
    # LC_ALL=C tr \/ -><-

  And finally
    * extra padding symbols at start, end, or both
    * add multi-character pictogram
        <->  [*]  ^-^  _-_  (O)  \/  /^\  :::>=-  -=<:::  -=#=-

Implementations
  hsxkpasswd
  Or my own passphrase_generator (see below)
    https://antofthy.gitlab.io/software/#passphrase_generator

-----
Using words from a large dictionary...

Using /usr/share/dict/words, available on almost all UNIX systems, but
limiting it to words that are 3 to 9 characters long. This produces a word
list of about 193K words, at the time of writing.

    # Select a random separation character
    # Note backslash needs extra backslash for use with 'paste'
    sep=$( shuf -en1 -- - _ + = \# \* : \| \~ ^ . , / \\\\ @ ! \? % \$ \& )

    # four random words, reasonable sized, joined by separator
    grep -E '^[a-z]{3,9}$' /usr/share/dict/words |
      shuf -n4 | paste -d"$sep" -s
    # Result Example:   scalier*kickee*mest*provider

The problem is that the words from this dictionary are generally not very
'common'! Instead you get a lot of obscure words that are rarely used and
difficult to spell. They are also more likely to be longer words rather
than shorter ones, simply because the dictionary contains more long words
than short ones.

-----
Using a dictionary of people's names

I found names generate more memorable phrases.

    sep=$( shuf -en1 -- - _ + = \# \* : \| \~ ^ . , / \\\\ @ ! \? % \$ \& )
    grep -v '^#' ~/lib/dict/names_male |
      shuf -n5 | sed 's/./\L\u&/' | paste -d"$sep" -s
    # Example:  Bret+Ron+Malcolm+Hernando+Steven

Uppercase letter placement can be varied. (first, second, third, last, all)

-----
A generator script...
Includes almost ALL the Hay Stack points and principles
  https://antofthy.gitlab.io/software/#passphrase_generator

Example Outputs:
    Portfolio,Avenge,Kilobyte,Safari
    ___~DICTUM~Chinese~decoder
    //fRightful|uNwitting|vEto|omen//
    croweD.trekkinG.:::>=-.lezzieS
    /-\-launch-SLIGHT-WISE
    __readable~brine~oft~Liberian__
    parka-attest-charter-chalice====
    Magician.Rotom.Gloater.Empire
    Prodigy=Sworn=Happily=+++
    [reFined][prImal][faCedown][coNfer]
    Trial|Theft|Want|5893
    brionne*1820*asteroid*tornadus
    -=#=-,shannon,arrokuda,thorny
    center.pacifism.census.___
    antigua=SLAW=tuxedo={*}
    propane,dolly,7701,camper

An option ('-c') is provided to generate simpler '4words' style
pass-phrases, like the first one shown above.

An '-o' option forces compatibility with 'oldstyle' password policies,
which are still in common use by many web servers. As such the option
ensures the password will include: numbers, symbols, capitals

For example: "passphrase_generator -c -o" generated...
    BENEFIT.gargle.1354.brewing
    Dentist_Wicket_3267_Bunkbed
    boYs=tiPsy=7667=grEater
    5286|FOLLOW|expiring|triceps
    bRowse|6023|lIberia|aBroad
    Emblem-2936-Dewdrop-Pact

---------------
Small dictionaries...

For use when you don't have a system dictionary available, such as in a
docker instance.

XKPassword, including proper nouns (1259 words)
  Web Site: https://metacpan.org/pod/Crypt::HSXKPasswd
  It's probably a little too small to be useful
  Example:  block_Korea_suffer_Mercury_Saturn

Mnemonic Encoding Word List (1626 words, 4-7 letters)
  Web Site: https://gist.github.com/fogleman/c4a1f69f34c7e8a00da8

  Yes, this is smaller than the Diceware word list (see next), and shares
  about 1/3 of its words with that dictionary (624), giving an extra ~1000
  common words.

    * Words 4 to 7 characters, and not a prefix of another word
    * Words that sound unique and should be usable the world over.
    * No two words should sound similar
    * No offensive words.
    * No tricky spelling or pronunciation.
    * The word should feel like words used for radio phonetic alphabets

Diceware EFF Dictionary (7776 words)
  https://theworld.com/~reinhold/diceware.html

  Small lists of words, if the dictionary size is a concern.
  Designed to allow people to roll dice to select words from the list
  (7776 = 6^5, that is five dice rolls per word).

  Such dictionaries should
    * Have enough terms to cover the full range of selection
    * Words should be short, but more than 3 or 4 characters each
    * Contain only ASCII characters, uppercase as appropriate
    * No word should be a prefix of another word (EG zoo, and zoom)
    * Contain no offensive terms

  See also "diceware-list"
    https://pypi.org/project/diceware-list/
    https://github.com/ulif/diceware-list
  Which provides python tools to generate new diceware lists.

  The EFF version was made later using plain common words, and is
  recommended. There are multiple versions of diceware word lists, many
  international.
    https://www.eff.org/deeplinks/2016/07/new-wordlists-random-passphrases

  A JavaScript page to roll the dice (in your browser, not over the
  network)...
    https://diceware.dmuth.org/

  EFF Fan Inspired: StarTrek, StarWars, Harry Potter, Game of Thrones
  NB: Can contain some non-ASCII chars, and each list is only 4000 words.
  (Lists are doubled up to cover the range, which is not good)
    https://eff.org/fandomdiceware

  The original diceware list by Arnold G. Reinhold contains a lot of
  non-words, numbers, punctuation, abbreviations, etc.
    https://theworld.com/~reinhold/diceware.wordlist.asc

  An alternative list by Alan Beale (of 12Dicts fame below) also contained
  many non-words and punctuated words.
    http://world.std.com/~reinhold/beale.wordlist.asc

  And a criticism of the above two (basically what I mentioned)
    http://www.webplaces.com/passwords/diceware-criticism.htm

  Combined lists that try to resolve these issues (lowercase only)
    http://www.webplaces.com/passwords/lists/Diceware-Combined-7776.txt
    http://www.webplaces.com/passwords/lists/Diceware-Improved-7776.txt
  Example:  majesty_elm_clang_maimed_disbelief

  International lists from some diceware rolling programs...
    https://github.com/yesiamben/diceware/tree/master/js/lists
    https://github.com/grempe/diceware/tree/master/lists

Word Lists for Secure Passphrases
  Web Site: http://www.webplaces.com/passwords/passphrase-word-lists.htm

  A collection of dictionaries for generating multi-word passwords from
  many sources, including...
    Scramble Wordlists separated by word length
    10K most used words (also from XKCD)
    DiceWare : Original and Improved Lists
    Playing Card Passphrase (PCP) Generation (alternative to diceware)

Anthony's Combined Word Lists
  Attempts to merge a number of dictionaries...

  Initial filter to: remove comments and the formatting of '12dict',
  limit word sizes to a 4 to 9 character range, remove duplicates,
  suffix/prefix duplicates, and any word on the offensive wordlist
  (which is rather overkill)

  # NOT DONE: capitalise names from name word lists

  Final word list is about 12k words long

    cd ~/lib/dict
    { grep '^#' dict_pwd_anthony ( sed '/^#/d; /[ .]/d; s/\/.*//; /..../!d;' dict_12dict_core tr '[:upper:]' '[:lower:]' dict_pwd_anthony.new

  The "wordlist_cleanup" program is a small perl script...
    * sorts the list case-insensitively,
    * removes duplicates (capitalized version preferred to be kept)
    * removes words with common suffixes like plurals,
    * removes words that are a prefix of another word
  https://antofthy.gitlab.io/software/#wordlist_cleanup

A short word wordlist
  Short words of 3-5 chars only, making a short list of ~4k words
  Prefix words and common suffixes are not removed
  The double sorting ensures the duplicate removal preserves capitalisations

    cd ~/lib/dict
    { grep '^#' dict_pwd_anthony_short ( sed '/^#/d; /[ .]/d; s/\/.*//; /..../!d;' dict_12dict_core tr '[:upper:]' '[:lower:]' dict_pwd_anthony_short.new

---------------
Larger dictionaries...

System "words" Dictionary...   "/usr/share/dict/words"
  Available on almost all UNIX Systems.
  Current Linux version is almost 480K words.
  If limited to 3 to 9 characters, it becomes 193K words.

  Contains many obscure words, jargon, names, and places, which makes it
  less than ideal, but it is a HUGE list!

    grep -E '^[a-z]{3,9}$' /usr/share/dict/words

Yet Another Word List (YAWL)
  https://ibiblio.org/pub/Linux/libs/yawl-0.3.2.tar.gz
  Pure alphabetical words, all lowercase
  Includes very LONG words, like "antidisestablishmentarian" and even longer!
  Contains around 264K words
  If limited to 3 to 9 characters it becomes 149.6K words

Peter Norvig's Common Words from Newspapers
  Web Site: http://norvig.com/ngrams/
    http://norvig.com/ngrams/count_1w.txt
  Again it contains a lot of obscure, long words, and Americanisms
  But he also provides many other word lists from the same sources.
  Example:  insulting^shahani^essonne^unoccupied

12Dicts by Alan Beale (version 6.0.2)
  Web Site: http://wordlist.aspell.net/12dicts/

  Contains lists of words for various purposes, such as games.
  Developed by comparing the words of 12 dictionaries, keeping the words
  that appear in multiple dictionaries, so as to remove jargon and obscure
  words. As such these dictionaries are smaller, and contain more common
  words, than the previous large dictionaries.
  Started originally for generating PGP passphrases.
  Some lists are devoted to words for: crossword puzzles, and scramble-like
  games.

  Dictionaries marked *** are probably what you are looking for.

  American Base Lists
    "6of12"     medium list of common words and phrases (32K)
    "2of12"     larger base list of words, no phrases (41K)
    "3esl"      American list of very common words (22K)
    "2of12inf"  expanded list with plurals & tenses, for use in games (82K)

  International Lists
    "2of4brif"  British list (deprecated for 'game' list, next) (60K)
    "3of6game"  International list for use in games (65K)  *****
    "5d+2a"     Desktop dictionary, no phrases, similar to Linux "words" (68K)
    "3of6all"   More words (hyphenated, names, phrases) (83K)

  Lemmatized (base word + closely related words)
    "2+2+3lem"  very large list, alphabetic by head word (84K)
    "2+2+3frq"  same list but ordered by frequency, excludes rare (34K)
    "2+2+3cmn"  smaller list of very common usage (25K)  ********

  Special Lists
    "neol2016"  new words not yet in general dictionaries (600)
    "2of5core"  very small list of words everyone should know (4690)  ***
    "6phrase"   long list of common phrases (22K)

  Example use...
    # Assumes a '#' comment header was added to the file.

    # clean up game words...  ( 49K words )
    sed -n '/^#/d; s/[^[:alpha:]]//; /^.\{3,9\}$/p' \
          ~/lib/dict/dict_12dict_game

    # lemmatized dictionary
    # extract words, ignore "he/she", keep hyphenated, and
    # contractions like "she'd"
    sed '/^#/d; s/^ *//; s/[*+]//g; s/, /\n/;' ~/lib/dict/dict_12dict_cmn |
      grep -E '^[^/]{3,9}$'
    # 18K words

    # OR ignoring all lemmatized variations
    grep -E '^[[:alpha:]-]{3,9}$' ~/lib/dict/dict_12dict_cmn
    # 15K words

    # Core word list, no phrases
    sed '/^#/d; / /d; s/\/.*//;' ~/lib/dict/dict_12dict_core
    # 4617 words

SCOWL -- used for aspell and hunspell
  Web Site: https://github.com/en-wl/wordlist
  Includes an Alternative 12Dicts Package version 4, but in a different
  format

ENABLE - Old Scrabble word list, that is VERY LARGE
  Original site is not online, but copies can be found via google.
Moby Project - from Project Gutenberg
  Web Site: http://digital.library.upenn.edu/webbin/gutbook/lookup?num=3201
  Includes
    common words                           74K
    acronyms and abbreviations list         6K
    compound & hyphenated words           256K
    Game wordlists with all tenses        114K
       + 4K more for official scrabble
    Common Names                           21K
    Place Names                            10K
    Male Names                              5K
    Female Names                            4K
    Common Misspellings                    366

Android Platform Dictionaries
  Web Site: https://android.googlesource.com/platform/packages/inputmethods/LatinIME/+/master/dictionaries
  Binary Dictionaries in multiple languages, with probabilities, and flags
  if a word is possibly offensive.

Offensive Word Lists
  While you are dealing with dictionaries, you may also want to look at,
  or maybe remove, words that are: profanity, swear, curse, or insulting.

  The 'base' dictionary is not very big, but comprehensive, while the
  'full' dictionary can include punctuation, numbers, character
  substitutions, etc. Both lists contain a few phrases (multiple words).
    https://www.freewebheaders.com/full-list-of-bad-words-banned-by-google/

  One list which I found thoroughly over-kill, in that it has many words
  I would not consider to be offensive unless part of a larger context,
  is...
    https://www.cs.cmu.edu/~biglou/resources/bad-words.txt

Cracklib Dictionaries
  These are databases used to PREVENT users from using 'common' words as
  a password. But then no password should ever be JUST one word!

  Not all entries are valid or pronounceable words, and it also includes
  common keyboard sequences, such as "qwerty". As such this dictionary does
  not make a good source of words for passwords.

  install package: cracklib

  Dictionary sizes as of Oct 2021
    cracklib-unpacker /usr/share/cracklib/cracklib-small | wc -l
    54763
    cracklib-unpacker /usr/share/cracklib/pw_dict | wc -l
    1965391

-------------------------------------------------------------------------------
Randomly generated phrases, quotes, and sentences.
Using a standard (even long) list of common and famous quotes is not
recommended, as the black hats can and do compile similar lists of quotes.
That is, a list of well known quotes has rather low entropy.
See...
  https://github.com/radicallyopensecurity/passphrase-cracking

However one idea is to use different dictionaries of word types to
generate random pseudo-sentences, which are longer, but even easier to
remember than a simpler random list of common words.

One example, based on the "Furry Dice" language
  https://www.youtube.com/watch?v=bxpc9Pp5pZM

  BNF Notation...
    <sentence> ::= <subject> <verb> <object>
    <subject>  ::= <article> <noun>
    <object>   ::= <article> <noun>
    <article>  ::= the | a
    <noun>     ::= dog | cat | man | woman | robot | ...
    <verb>     ::= bit | kicked | stroked | ...

  Random output example...
    The robot kicked a cat

Another example is used by "docker-ce" to generate a random name for
containers. It selects an adjective and the surname of a famous scientist
or hacker, and combines them. It's a little short for use as a password,
but you get the idea.
See..
  https://agarwalrounak.medium.com/default-container-names-in-docker-15bdbf56b539

A version where you can set the number of words used.
  https://github.com/jjmontesl/codenamize

The lists of words should be very very long, and the language can be made
a lot more complicated with added adjectives and tenses. This would likely
need something like a neural network to generate phrases in a way that
works properly (maybe "ChatGPT"?).

Results should be checked against a list of common quotes.

Of course numbers, symbols, pictograms, and padding should also be added
(See the 'Hay Stack' principle, above)

Example Generators...
  http://watchout4snakes.com/wo4snakes/Random/RandomSentence
  https://phrasegenerator.com

-------------------------------------------------------------------------------
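The pseudo-sentence grammar described in this section can be sketched in a
few lines of shell. This assumes GNU "shuf" is available. The sentence
pattern (article noun verb article noun) is inferred from the example
output, the tiny word lists are placeholders only, and the function names
"pick" and "random_sentence" are invented for this illustration.

```shell
# Placeholder word lists, far too short for real use.
articles="the a"
nouns="dog cat man woman robot"
verbs="bit kicked stroked"

# Print one of the arguments, chosen at random.
pick() { shuf -n1 -e -- "$@"; }

# Assumed pattern: article noun verb article noun
# The word lists are deliberately unquoted so they split into arguments.
random_sentence() {
  echo "$(pick $articles) $(pick $nouns) $(pick $verbs)" \
       "$(pick $articles) $(pick $nouns)"
}

random_sentence
```

With lists this small there are only 2*5*3*2*5 = 300 possible sentences
(about 8 bits), which is why the word lists need to be very long before
the result is of any use as a passphrase.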