COMMAND LINE: ./consensus-v6c -f gal4.cons -L 8

***** PID: 21785 *****

L-mer Width: 8
Minimum distance between starting points of words: not relevant
Save the top alignments derived from each intermediate alignment
Maximum number of matrices to save between cycles: 1000
Status of complementary sequence: IGNORE.
Algorithim options: one match per sequence.
Stop only when the maximum number of cycles is reached.
The number of matrices to print.
     Top Matrices saved from each cycle: 4
     Matrices Saved from the last cycle: 4

***** Sequence information from file "gal4.cons". *****
sequence 1: >YBR018C
     fragments: 1-500
sequence 2: >YBR019C
     fragments: 1-500
sequence 3: >YBR020W
     fragments: 1-600
sequence 4: >YLR081W
     fragments: 1-500
sequence 5: >YML051W
     fragments: 1-500
sequence 6: >YOR120W
     fragments: 1-500
Total number of sequences: 6.
Total number of sequence fragments: 6.

#**** Information on observed frequency and occurrence of each letter. ****#
#Total number of letters in the input sequences = 3100
A 0.312903; observed occurrence = 970 (letter 1)
C 0.200323; observed occurrence = 621 (letter 2)
G 0.193548; observed occurrence = 600 (letter 3)
T 0.293226; observed occurrence = 909 (letter 4)
PRIOR FREQUENCIES DETERMINED BY OBSERVED FREQUENCIES.

***** Information for the alphabet from file "alphabet". *****
letter 1: A (complement: T) prior frequency = 0.312903
letter 2: C (complement: G) prior frequency = 0.200323
letter 3: G (complement: C) prior frequency = 0.193548
letter 4: T (complement: A) prior frequency = 0.293226

INFORMATION CONTENT IS CALCULATED USING NATURAL LOGARITHMS (i.e. BASE e).
DIVIDE BY ln(2) = 0.693 TO CONVERT TO BASE 2, WHICH WAS USED IN PREVIOUS
VERSIONS OF THIS PROGRAM.

      []     MATRICES SAVED FOR NEXT CYCLE     []
      []------------------------------------------------------[]
      []  total   | top adjusted |   ln top    [] ln expected []
CYCLE []  number  | information  |   p-value   []  frequency  []
------[]----------|--------------|-------------[]-------------[]
    1 []   3058   |    2.1623    |    0.0000   []    8.0230   []
    2 []    726   |    5.8438    |  -18.9337   []   -3.7631   []
    3 []    835   |    6.9034    |  -25.3021   []   -3.6125   []
    4 []    812   |    7.0959    |  -29.8303   []   -2.1970   []
    5 []    857   |    7.2924    |  -35.7440   []   -2.7958   []
    6 []    895   |    7.3512    |  -41.4052   []   -4.0175   []

INFORMATION CONTENT IS CALCULATED USING NATURAL LOGARITHMS (i.e. BASE e).
DIVIDE BY ln(2) = 0.693 TO CONVERT TO BASE 2, WHICH WAS USED IN PREVIOUS
VERSIONS OF THIS PROGRAM.

THE LIST OF TOP MATRICES FROM EACH CYCLE--sorted by expected frequency (total of 5):

MATRIX 1
number of sequences = 6
unadjusted information = 9.78142
sample size adjusted information = 7.35115
ln(p-value) = -41.4052   p-value = 1.04218E-18
ln(expected frequency) = -4.01747   expected frequency = 0.0179984
A |   0   0   0   2   0   0   5   0
C |   6   0   1   2   0   6   0   3
G |   0   6   5   2   6   0   1   3
T |   0   0   0   0   0   0   0   0
   1|3   :  1/222   CGGAGCAC
   2|1   :  2/168   CGGAGCAG
   3|6   :  3/251   CGCCGCAC
   4|5   :  4/168   CGGGGCGG
   5|4   :  5/329   CGGCGCAC
   6|2   :  6/132   CGGGGCAG

MATRIX 2
number of sequences = 2
unadjusted information = 12.6543
sample size adjusted information = 5.84381
ln(p-value) = -18.9337   p-value = 5.98671E-09
ln(expected frequency) = -3.76309   expected frequency = 0.0232118
A |   0   0   0   0   0   0   0   0
C |   0   0   0   0   0   0   2   2
G |   2   0   2   2   2   2   0   0
T |   0   2   0   0   0   0   0   0
   1|2   :  2/97    GTGGGGCC
   2|1   :  5/95    GTGGGGCC

MATRIX 3
number of sequences = 3
unadjusted information = 11.7564
sample size adjusted information = 6.90341
ln(p-value) = -25.3021   p-value = 1.02667E-11
ln(expected frequency) = -3.61251   expected frequency = 0.0269841
A |   0   3   0   0   0   0   0   0
C |   2   0   3   3   0   0   3   0
G |   0   0   0   0   3   3   0   3
T |   1   0   0   0   0   0   0   0
   1|2   :  4/82    CACCGGCG
   2|1   :  5/326   TACCGGCG
   3|3   :  6/118   CACCGGCG

MATRIX 4
number of sequences = 5
unadjusted information = 10.2435
sample size adjusted information = 7.29244
ln(p-value) = -35.744   p-value = 2.99616E-16
ln(expected frequency) = -2.79582   expected frequency = 0.0610648
A |   0   0   0   2   0   0   4   0
C |   5   0   0   1   0   5   0   2
G |   0   5   5   2   5   0   1   3
T |   0   0   0   0   0   0   0   0
   1|3   :  1/222   CGGAGCAC
   2|1   :  2/168   CGGAGCAG
   3|5   :  4/168   CGGGGCGG
   4|4   :  5/329   CGGCGCAC
   5|2   :  6/132   CGGGGCAG

INFORMATION CONTENT IS CALCULATED USING NATURAL LOGARITHMS (i.e. BASE e).
DIVIDE BY ln(2) = 0.693 TO CONVERT TO BASE 2, WHICH WAS USED IN PREVIOUS
VERSIONS OF THIS PROGRAM.

THE LIST OF MATRICES FROM FINAL CYCLE--sorted by information (total of 895):

MATRIX 1
number of sequences = 6
unadjusted information = 9.78142
sample size adjusted information = 7.35115
ln(p-value) = -41.4052   p-value = 1.04218E-18
ln(expected frequency) = -4.01747   expected frequency = 0.0179984
A |   0   0   0   2   0   0   5   0
C |   6   0   1   2   0   6   0   3
G |   0   6   5   2   6   0   1   3
T |   0   0   0   0   0   0   0   0
   1|3   :  1/222   CGGAGCAC
   2|1   :  2/168   CGGAGCAG
   3|6   :  3/251   CGCCGCAC
   4|5   :  4/168   CGGGGCGG
   5|4   :  5/329   CGGCGCAC
   6|2   :  6/132   CGGGGCAG

MATRIX 2
number of sequences = 6
unadjusted information = 9.75092
sample size adjusted information = 7.32065
ln(p-value) = -41.1478   p-value = 1.34809E-18
ln(expected frequency) = -3.7601   expected frequency = 0.0232813
A |   0   0   0   3   0   0   4   0
C |   6   1   0   1   0   6   0   2
G |   0   5   6   2   6   0   2   4
T |   0   0   0   0   0   0   0   0
   1|3   :  1/222   CGGAGCAC
   2|1   :  2/168   CGGAGCAG
   3|6   :  3/162   CCGAGCGG
   4|5   :  4/168   CGGGGCGG
   5|4   :  5/329   CGGCGCAC
   6|2   :  6/132   CGGGGCAG

MATRIX 3
number of sequences = 6
unadjusted information = 9.69009
sample size adjusted information = 7.25982
ln(p-value) = -40.6366   p-value = 2.24782E-18
ln(expected frequency) = -3.24883   expected frequency = 0.0388196
A |   0   0   0   3   0   0   6   0
C |   6   0   1   2   0   6   0   4
G |   0   6   5   1   5   0   0   2
T |   0   0   0   0   1   0   0   0
   1|1   :  1/222   CGGAGCAC
   2|4   :  2/168   CGGAGCAG
   3|3   :  3/251   CGCCGCAC
   4|6   :  4/173   CGGATCAC
   5|2   :  5/329   CGGCGCAC
   6|5   :  6/132   CGGGGCAG

MATRIX 4
number of sequences = 6
unadjusted information = 9.65776
sample size adjusted information = 7.22749
ln(p-value) = -40.3659   p-value = 2.94643E-18
ln(expected frequency) = -2.9782   expected frequency = 0.0508846
A |   0   0   0   3   0   1   6   0
C |   6   0   1   3   0   5   0   5
G |   0   6   5   0   5   0   0   1
T |   0   0   0   0   1   0   0   0
   1|1   :  1/222   CGGAGCAC
   2|5   :  2/168   CGGAGCAG
   3|3   :  3/251   CGCCGCAC
   4|6   :  4/173   CGGATCAC
   5|2   :  5/329   CGGCGCAC
   6|4   :  6/121   CGGCGAAC
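
NOTE ON THE "unadjusted information" FIGURES. The values above appear consistent with a relative-entropy sum over alignment columns, sum_i sum_b f(i,b) * ln(f(i,b) / p(b)), where f is the observed letter frequency in column i and p is the prior frequency from the alphabet section. The Python sketch below is not taken from the program's source. It assumes that formula, uses hypothetical names (PRIORS, SITES, count_matrix, unadjusted_information), and rechecks MATRIX 1 from its six aligned sites. It does not attempt the sample-size adjustment, the p-value, or the expected frequency.

```python
import math

# Prior letter frequencies copied from the alphabet section of the log.
PRIORS = {"A": 0.312903, "C": 0.200323, "G": 0.193548, "T": 0.293226}

# The six sites aligned in MATRIX 1, copied from the log.
SITES = ["CGGAGCAC", "CGGAGCAG", "CGCCGCAC", "CGGGGCGG", "CGGCGCAC", "CGGGGCAG"]


def count_matrix(sites):
    """Return one {letter: count} dict per alignment column."""
    width = len(sites[0])
    return [
        {base: sum(site[i] == base for site in sites) for base in PRIORS}
        for i in range(width)
    ]


def unadjusted_information(sites, priors=PRIORS):
    """Assumed formula: sum over columns and letters of f * ln(f / p), in nats."""
    n = len(sites)
    total = 0.0
    for column in count_matrix(sites):
        for base, count in column.items():
            if count:  # a zero count contributes nothing to the sum
                f = count / n
                total += f * math.log(f / priors[base])
    return total


info_nats = unadjusted_information(SITES)
info_bits = info_nats / math.log(2)  # the base-2 conversion described in the log
print(f"unadjusted information: {info_nats:.4f} nats = {info_bits:.4f} bits")
```

Under these assumptions the result agrees with the logged 9.78142 for MATRIX 1 to about three decimal places. The small remainder is expected, since the priors are printed to only six digits.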