Post History
Given a string of Roman numerals, decide whether it forms a valid Roman number. If not, output the substring that proves this, from the list of 50 strings described below. Relevant fact This chal...
#5: Post edited
- Given a string of Roman numerals, decide whether it forms a valid Roman number. If not, output the substring that proves this, from the list of 50 strings described below.
- ## Relevant fact
- This challenge is based around the following fact:
- > A string of Roman numerals is a valid Roman number if and only if it contains none of the following 50 strings as a substring:
- >
- > ```text
- > "CCCC", "CCD", "CCM", "CDC", "CMC", "CMD", "CMM", "DCD", "DCM", "DD", "DM", "IC", "ID", "IIII", "IIV", "IIX", "IL", "IM", "IVI", "IXC", "IXI", "IXL", "IXV", "IXX", "LC", "LD", "LL", "LM", "LXC", "LXL", "MMMM", "VC", "VD", "VIV", "VIX", "VL", "VM", "VV", "VX", "XCC", "XCD", "XCL", "XCM", "XCX", "XD", "XLX", "XM", "XXC", "XXL", "XXXX"
- > ```
- This works for any length string, if a valid Roman number is defined as follows:
- ## Valid Roman numbers[^1]
- - Each numeral appears no more than 3 times consecutively.
- - Each of `V` (5), `L` (50), and `D` (500) appears no more than once consecutively.
- - A Roman number is constructed by concatenating the strings representing its ***thousands***, ***hundreds***, ***tens***, and ***units*** components.
- Decimal | Thousands | Hundreds | Tens | Units
- ------- | --------- | -------- | ---- | -
- 1 | M | C | X | I
- 2 | MM | CC | XX | II
- 3 | MMM | CCC | XXX | III
- 4 | | CD | XL | IV
- 5 | | D | L | V
- 6 | | DC | LX | VI
- 7 | | DCC | LXX | VII
- 8 | | DCCC | LXXX | VIII
- 9 | | CM | XC | IX
- So, for example, 2345 would be represented as the concatenation of `MM` (for 2000), `CCC` (for 300), `XL` (for 40), and `V` (for 5), or `MMCCCXLV`.
- This defines a unique correct representation for each number from 1 to 3999 (4000 and above are not representable without breaking the first two rules). The 50 substrings method described above will identify each of these 3999 strings as valid, and all other strings of Roman numerals as invalid.
- ## Input
- - A string containing only Roman numerals (`I`, `V`, `X`, `L`, `C`, `D`, `M`).
- - The string will have length at least 1.
- - The string will have length at most 15 (this is the length of the longest valid string of Roman numerals).
- - You may choose to take input in lower case instead, provided that you also use lower case in your output for an input that is not a valid Roman number.
- ## Output
- - If the input is a valid Roman number, output a consistent value indicating this.
- - Consistent means that the value must be the same for all valid Roman numbers.
- - The output for a valid Roman number must not be one of the strings from the list of 50.
- - Since the consistent value can be anything (and specifically does not need to contain Roman numerals), there is no requirement for it to be upper or lower case.
- - If the input is not a valid Roman number, output exactly 1 string from the list of 50.
- - The output in this case must be a substring of the input.
- - If the input has 2 or more of the strings from the list of 50 as substrings, you may choose any 1 of them to be the output, but you must choose only 1 of them (you must not output 2 or more).
- - If you chose to take input in lower case, then this output string must also be in lower case.
- ## Examples
- ### A valid Roman number
- The input `MCMXCVI` is the unique correct representation of 1996. It contains none of the 50 strings.
- ### A string that is not a valid Roman number
- Although the input `MMXDIII` might be suspected of representing 2493, it is not the unique correct representation of this number (which is `MMCDXCIII`). Note that it has `XD` as a substring, identifying it as invalid. The only correct output is therefore `XD`.
- ### An invalid string with more than 1 potential output
- The input `MMCCMDXXV` has 2 substrings that make it invalid, so either `CCM` or `CMD` would be correct outputs. It would not be correct to output both of these, or to output their overlap `CCMD`, as this is not one of the 50 strings.
- ## Test cases
- Test cases are in the format `INPUT : VALID, OUTPUTS`. Note that only one of the valid outputs can be chosen - outputting 2 or more is incorrect.
- The output "VALID" is just an example - for an input that is a valid Roman number you may choose to output any consistent value distinct from the 50 strings.
- ### Upper case test cases
- These reflect the case used in the rest of the challenge wording, although there is no requirement to use upper case for this challenge.
- ```text
- I : VALID
- V : VALID
- X : VALID
- L : VALID
- C : VALID
- D : VALID
- M : VALID
- II : VALID
- VV : VV
- XX : VALID
- LL : LL
- CC : VALID
- DD : DD
- MM : VALID
- III : VALID
- VII : VALID
- IVI : IVI
- IIV : IIV
- CCI : VALID
- CCV : VALID
- CCX : VALID
- CCL : VALID
- CCC : VALID
- CCD : CCD
- CCM : CCM
- IIII : IIII
- MLDI : LD
- MXXC : XXC
- DCIIX : IIX
- MCXXXX : XXXX
- MCCCCXVI : CCCC
- MMLXCVII : LXC
- MMMCMXCIX : VALID
- MMMDCCCLXXXVIII : VALID
- MMCCCXLV : VALID
- MCMXCVI : VALID
- MMXDIII : XD
- MMCDXCIII : VALID
- MMCCMDXXV : CCM, CMD
- XXX : VALID
- CLLX : LL
- DXXDMMV : DM, XD
- CCDDDIMDD : DD, IM, CCD
- VLCXIVXMCVXLC : VX, VL, LC, XM
- DVLIILVCXVXVMLI : VX, VL, IL, VM, VC
- VVDLMIVILXXDX : VV, VD, IL, XD, LM, IVI
- DMXXCMILVCMLLMV : DM, IL, LL, VC, LM, XXC, XCM
- CMXDVLCCDDLXLXC : DD, VL, LC, XD, XLX, LXC, CCD, LXL
- XDIXCLLMVVLCMCM : VV, VL, LC, XD, LL, LM, IXC, CMC, XCL
- IIXXCDVVLMILVDD : DD, VV, VL, VD, IL, LM, XXC, IXX, IIX, XCD
- DDMIIXXCMCCDCMM : DD, DM, CMC, CMM, XXC, IXX, DCM, CDC, XCM, CCD, IIX
- ```
- ### Lower case test cases
- These are the same test cases in lower case, in case you can benefit from taking lower case input.
- Note that if you take lower case input then you must also give lower case output.
- ```text
- i : valid
- v : valid
- x : valid
- l : valid
- c : valid
- d : valid
- m : valid
- ii : valid
- vv : vv
- xx : valid
- ll : ll
- cc : valid
- dd : dd
- mm : valid
- iii : valid
- vii : valid
- ivi : ivi
- iiv : iiv
- cci : valid
- ccv : valid
- ccx : valid
- ccl : valid
- ccc : valid
- ccd : ccd
- ccm : ccm
- iiii : iiii
- mldi : ld
- mxxc : xxc
- dciix : iix
- mcxxxx : xxxx
- mccccxvi : cccc
- mmlxcvii : lxc
- mmmcmxcix : valid
- mmmdccclxxxviii : valid
- mmcccxlv : valid
- mcmxcvi : valid
- mmxdiii : xd
- mmcdxciii : valid
- mmccmdxxv : ccm, cmd
- xxx : valid
- cllx : ll
- dxxdmmv : dm, xd
- ccdddimdd : dd, im, ccd
- vlcxivxmcvxlc : vx, vl, lc, xm
- dvliilvcxvxvmli : vx, vl, il, vm, vc
- vvdlmivilxxdx : vv, vd, il, xd, lm, ivi
- dmxxcmilvcmllmv : dm, il, ll, vc, lm, xxc, xcm
- cmxdvlccddlxlxc : dd, vl, lc, xd, xlx, lxc, ccd, lxl
- xdixcllmvvlcmcm : vv, vl, lc, xd, ll, lm, ixc, cmc, xcl
- iixxcdvvlmilvdd : dd, vv, vl, vd, il, lm, xxc, ixx, iix, xcd
- ddmiixxcmccdcmm : dd, dm, cmc, cmm, xxc, ixx, dcm, cdc, xcm, ccd, iix
- ```
- ## Scoring
- This is a [code golf challenge]. Your score is the number of bytes in your code. Lowest score for each language wins.
- > Explanations are optional, but I'm more likely to upvote answers that have one.
- [code golf challenge]: https://codegolf.codidact.com/categories/49/tags/4274 "The code-golf tag"
- [^1]: This is a common modern set of rules, described as [Standard form](https://en.wikipedia.org/wiki/Roman_numerals#Standard_form) on Wikipedia. It does not reflect all usages during history, but will be the basis of this challenge, since otherwise the 50 substrings approach does not work.
- Given a string of Roman numerals, decide whether it forms a valid Roman number. If not, output the substring that proves this, from the list of 50 strings described below.
- ## Relevant fact
- This challenge is based around the following fact:
- > A string of Roman numerals is a valid Roman number if and only if it contains none of the following 50 strings as a substring:
- >
- > ```text
- > "CCCC", "CCD", "CCM", "CDC", "CMC", "CMD", "CMM", "DCD", "DCM", "DD", "DM", "IC", "ID", "IIII", "IIV", "IIX", "IL", "IM", "IVI", "IXC", "IXI", "IXL", "IXV", "IXX", "LC", "LD", "LL", "LM", "LXC", "LXL", "MMMM", "VC", "VD", "VIV", "VIX", "VL", "VM", "VV", "VX", "XCC", "XCD", "XCL", "XCM", "XCX", "XD", "XLX", "XM", "XXC", "XXL", "XXXX"
- > ```
- This works for any length string, if a valid Roman number is defined as follows:
- ## Valid Roman numbers[^1]
- - Each numeral appears no more than 3 times consecutively.
- - Each of `V` (5), `L` (50), and `D` (500) appears no more than once consecutively.
- - A Roman number is constructed by concatenating the strings representing its ***thousands***, ***hundreds***, ***tens***, and ***units*** components.
- Decimal | Thousands | Hundreds | Tens | Units
- ------- | --------- | -------- | ---- | -
- 1 | M | C | X | I
- 2 | MM | CC | XX | II
- 3 | MMM | CCC | XXX | III
- 4 | | CD | XL | IV
- 5 | | D | L | V
- 6 | | DC | LX | VI
- 7 | | DCC | LXX | VII
- 8 | | DCCC | LXXX | VIII
- 9 | | CM | XC | IX
- So, for example, 2345 would be represented as the concatenation of `MM` (for 2000), `CCC` (for 300), `XL` (for 40), and `V` (for 5), or `MMCCCXLV`.
- This defines a unique correct representation for each number from 1 to 3999 (4000 and above are not representable without breaking the first two rules). The 50 substrings method described above will identify each of these 3999 strings as valid, and all other strings of Roman numerals as invalid.
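The construction above can be sketched in Python as a sanity check on one direction of the fact: every string built from the table should be free of the 50 substrings. This is an illustrative, ungolfed sketch, not part of the challenge; the names `FORBIDDEN`, `to_roman`, and `contains_forbidden` are assumptions made for this example.

```python
# The 50 forbidden substrings, copied from the list in the challenge.
FORBIDDEN = (
    "CCCC CCD CCM CDC CMC CMD CMM DCD DCM DD DM IC ID IIII IIV IIX IL IM "
    "IVI IXC IXI IXL IXV IXX LC LD LL LM LXC LXL MMMM VC VD VIV VIX VL VM "
    "VV VX XCC XCD XCL XCM XCX XD XLX XM XXC XXL XXXX"
).split()

# One list per table column, indexed by decimal digit (digit 0 is empty).
THOUSANDS = ["", "M", "MM", "MMM"]
HUNDREDS = ["", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"]
TENS = ["", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"]
UNITS = ["", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]


def to_roman(n):
    """Concatenate the thousands, hundreds, tens, and units components."""
    if not 1 <= n <= 3999:
        raise ValueError("only 1 to 3999 are representable")
    return (
        THOUSANDS[n // 1000]
        + HUNDREDS[n // 100 % 10]
        + TENS[n // 10 % 10]
        + UNITS[n % 10]
    )


def contains_forbidden(s):
    """True if any of the 50 strings occurs as a substring of s."""
    return any(f in s for f in FORBIDDEN)


# All 3999 valid Roman numbers, built directly from the table.
VALID_NUMBERS = [to_roman(n) for n in range(1, 4000)]

print(to_roman(2345))  # MMCCCXLV
print(max(len(s) for s in VALID_NUMBERS))  # length of the longest valid string
print(any(contains_forbidden(s) for s in VALID_NUMBERS))
```

The sketch only exercises the "valid implies no forbidden substring" direction. Checking the converse would mean enumerating the strings that are not valid, which is feasible for short lengths but is left out here.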
- ## Input
- - A string containing only Roman numerals (`I`, `V`, `X`, `L`, `C`, `D`, `M`).
- - The string will have length at least 1.
- - The string will have length at most 15 (this is the length of the longest valid string of Roman numerals).
- - You may choose to take input in lower case instead, provided that you also use lower case in your output for an input that is not a valid Roman number.
- ## Output
- - If the input is a valid Roman number, output a consistent value indicating this.
- - Consistent means that the value must be the same for all valid Roman numbers.
- - The output for a valid Roman number must not be one of the strings from the list of 50.
- - Since the consistent value can be anything (and specifically does not need to contain Roman numerals), there is no requirement for it to be upper or lower case.
- - If the input is not a valid Roman number, output exactly 1 string from the list of 50.
- - The output in this case must be a substring of the input.
- - If the input has 2 or more of the strings from the list of 50 as substrings, you may choose any 1 of them to be the output, but you must choose only 1 of them (you must not output 2 or more).
- - If you chose to take input in lower case, then this output string must also be in lower case.
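As a rough illustration of the input and output rules, here is an ungolfed Python sketch that takes upper case input and returns either a fixed marker or a single string from the list. The function name `check` and the marker `"VALID"` are assumptions chosen for this example; any consistent value outside the 50 strings would satisfy the rules equally well.

```python
# The 50 forbidden substrings, copied from the list in the challenge.
FORBIDDEN = (
    "CCCC CCD CCM CDC CMC CMD CMM DCD DCM DD DM IC ID IIII IIV IIX IL IM "
    "IVI IXC IXI IXL IXV IXX LC LD LL LM LXC LXL MMMM VC VD VIV VIX VL VM "
    "VV VX XCC XCD XCL XCM XCX XD XLX XM XXC XXL XXXX"
).split()


def check(s):
    """Return "VALID", or exactly one forbidden string that occurs in s."""
    for f in FORBIDDEN:
        if f in s:
            # Stop at the first match so only 1 string is ever output.
            return f
    return "VALID"


print(check("MCMXCVI"))  # VALID
print(check("MMXDIII"))  # XD
print(check("MMCCMDXXV"))  # CCM (CMD would also be acceptable)
```

Because the loop returns on the first match, the result is always a substring of the input and never a combination such as `CCMD`. Which of several candidates is returned depends only on the order of `FORBIDDEN`.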
- ## Examples
- ### A valid Roman number
- The input `MCMXCVI` is the unique correct representation of 1996. It contains none of the 50 strings.
- ### A string that is not a valid Roman number
- Although the input `MMXDIII` might be suspected of representing 2493, it is not the unique correct representation of this number (which is `MMCDXCIII`). Note that it has `XD` as a substring, identifying it as invalid. The only correct output is therefore `XD`.
- ### An invalid string with more than 1 potential output
- The input `MMCCMDXXV` has 2 substrings that make it invalid, so either `CCM` or `CMD` would be correct outputs. It would not be correct to output both of these, or to output their overlap `CCMD`, as this is not one of the 50 strings.
- ## Test cases
- Test cases are in the format `INPUT : VALID, OUTPUTS`. Note that only one of the valid outputs can be chosen - outputting 2 or more is incorrect.
- The output "VALID" is just an example - for an input that is a valid Roman number you may choose to output any consistent value distinct from the 50 strings.
- ### Upper case test cases
- These reflect the case used in the rest of the challenge wording, although there is no requirement to use upper case for this challenge.
- ```text
- I : VALID
- V : VALID
- X : VALID
- L : VALID
- C : VALID
- D : VALID
- M : VALID
- II : VALID
- VV : VV
- XX : VALID
- LL : LL
- CC : VALID
- DD : DD
- MM : VALID
- III : VALID
- VII : VALID
- IVI : IVI
- IIV : IIV
- CCI : VALID
- CCV : VALID
- CCX : VALID
- CCL : VALID
- CCC : VALID
- CCD : CCD
- CCM : CCM
- IIII : IIII
- MLDI : LD
- MXXC : XXC
- DCIIX : IIX
- MCXXXX : XXXX
- MCCCCXVI : CCCC
- MMLXCVII : LXC
- MMMCMXCIX : VALID
- MMMDCCCLXXXVIII : VALID
- MMCCCXLV : VALID
- MCMXCVI : VALID
- MMXDIII : XD
- MMCDXCIII : VALID
- MMCCMDXXV : CCM, CMD
- XXX : VALID
- CLLX : LL
- DXXDMMV : DM, XD
- CCDDDIMDD : DD, IM, CCD
- VLCXIVXMCVXLC : VX, VL, LC, XM
- DVLIILVCXVXVMLI : VX, VL, IL, VM, VC
- VVDLMIVILXXDX : VV, VD, IL, XD, LM, IVI
- DMXXCMILVCMLLMV : DM, IL, LL, VC, LM, XXC, XCM
- CMXDVLCCDDLXLXC : DD, VL, LC, XD, XLX, LXC, CCD, LXL
- XDIXCLLMVVLCMCM : VV, VL, LC, XD, LL, LM, IXC, CMC, XCL
- IIXXCDVVLMILVDD : DD, VV, VL, VD, IL, LM, XXC, IXX, IIX, XCD
- DDMIIXXCMCCDCMM : DD, DM, CMC, CMM, XXC, IXX, DCM, CDC, XCM, CCD, IIX
- ```
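For anyone who wants to run the test cases automatically, the `INPUT : OUTPUTS` format can be parsed with a few lines of Python. This is a sketch under assumptions: `parse_case`, `passes`, and the reference `solve` function are hypothetical names for this example, and the marker for valid input is passed in separately because answers may choose any consistent value.

```python
# The 50 forbidden substrings, copied from the list in the challenge.
FORBIDDEN = (
    "CCCC CCD CCM CDC CMC CMD CMM DCD DCM DD DM IC ID IIII IIV IIX IL IM "
    "IVI IXC IXI IXL IXV IXX LC LD LL LM LXC LXL MMMM VC VD VIV VIX VL VM "
    "VV VX XCC XCD XCL XCM XCX XD XLX XM XXC XXL XXXX"
).split()


def solve(s):
    """Hypothetical reference answer: first forbidden substring, else "ok"."""
    return next((f for f in FORBIDDEN if f in s), "ok")


def parse_case(line):
    """Split 'INPUT : A, B' into the input and the set of accepted outputs."""
    text, _, outputs = line.partition(" : ")
    return text.strip(), {o.strip() for o in outputs.split(",")}


def passes(solver, line, valid_marker):
    """Check one test case line against a solver's single output."""
    text, accepted = parse_case(line)
    result = solver(text)
    if accepted == {"VALID"}:
        # "VALID" in the test cases stands for the solver's own marker.
        return result == valid_marker
    return result in accepted


cases = ["I : VALID", "MMXDIII : XD", "MMCCMDXXV : CCM, CMD"]
print(all(passes(solve, line, "ok") for line in cases))
```

A solver that returned `CCMD` for `MMCCMDXXV` would fail here, since the comparison is against the individual accepted strings rather than any combination of them.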
- ### Lower case test cases
- These are the same test cases in lower case, in case you can benefit from taking lower case input.
- Note that if you take lower case input then you must also give lower case output where it is one of the 50 strings.
- ```text
- i : valid
- v : valid
- x : valid
- l : valid
- c : valid
- d : valid
- m : valid
- ii : valid
- vv : vv
- xx : valid
- ll : ll
- cc : valid
- dd : dd
- mm : valid
- iii : valid
- vii : valid
- ivi : ivi
- iiv : iiv
- cci : valid
- ccv : valid
- ccx : valid
- ccl : valid
- ccc : valid
- ccd : ccd
- ccm : ccm
- iiii : iiii
- mldi : ld
- mxxc : xxc
- dciix : iix
- mcxxxx : xxxx
- mccccxvi : cccc
- mmlxcvii : lxc
- mmmcmxcix : valid
- mmmdccclxxxviii : valid
- mmcccxlv : valid
- mcmxcvi : valid
- mmxdiii : xd
- mmcdxciii : valid
- mmccmdxxv : ccm, cmd
- xxx : valid
- cllx : ll
- dxxdmmv : dm, xd
- ccdddimdd : dd, im, ccd
- vlcxivxmcvxlc : vx, vl, lc, xm
- dvliilvcxvxvmli : vx, vl, il, vm, vc
- vvdlmivilxxdx : vv, vd, il, xd, lm, ivi
- dmxxcmilvcmllmv : dm, il, ll, vc, lm, xxc, xcm
- cmxdvlccddlxlxc : dd, vl, lc, xd, xlx, lxc, ccd, lxl
- xdixcllmvvlcmcm : vv, vl, lc, xd, ll, lm, ixc, cmc, xcl
- iixxcdvvlmilvdd : dd, vv, vl, vd, il, lm, xxc, ixx, iix, xcd
- ddmiixxcmccdcmm : dd, dm, cmc, cmm, xxc, ixx, dcm, cdc, xcm, ccd, iix
- ```
- ## Scoring
- This is a [code golf challenge]. Your score is the number of bytes in your code. Lowest score for each language wins.
- > Explanations are optional, but I'm more likely to upvote answers that have one.
- [code golf challenge]: https://codegolf.codidact.com/categories/49/tags/4274 "The code-golf tag"
- [^1]: This is a common modern set of rules, described as [Standard form](https://en.wikipedia.org/wiki/Roman_numerals#Standard_form) on Wikipedia. It does not reflect all usages during history, but will be the basis of this challenge, since otherwise the 50 substrings approach does not work.
#4: Post edited
- Given a string of Roman numerals, decide whether it forms a valid Roman number. If not, output the substring that proves this, from the list of 50 strings described below.
- ## Relevant fact
- This challenge is based around the following fact:
- > A string of Roman numerals is a valid Roman number if and only if it contains none of the following 50 strings as a substring:
- >
- > ```text
- > "CCCC", "CCD", "CCM", "CDC", "CMC", "CMD", "CMM", "DCD", "DCM", "DD", "DM", "IC", "ID", "IIII", "IIV", "IIX", "IL", "IM", "IVI", "IXC", "IXI", "IXL", "IXV", "IXX", "LC", "LD", "LL", "LM", "LXC", "LXL", "MMMM", "VC", "VD", "VIV", "VIX", "VL", "VM", "VV", "VX", "XCC", "XCD", "XCL", "XCM", "XCX", "XD", "XLX", "XM", "XXC", "XXL", "XXXX"
- > ```
- This works for any length string, if a valid Roman number is defined as follows:
- ## Valid Roman numbers[^1]
- - Each numeral appears no more than 3 times consecutively.
- - Each of `V` (5), `L` (50), and `D` (500) appears no more than once consecutively.
- - A Roman number is constructed by concatenating the strings representing its ***thousands***, ***hundreds***, ***tens***, and ***units*** components.
- Decimal | Thousands | Hundreds | Tens | Units
- ------- | --------- | -------- | ---- | -
- 1 | M | C | X | I
- 2 | MM | CC | XX | II
- 3 | MMM | CCC | XXX | III
- 4 | | CD | XL | IV
- 5 | | D | L | V
- 6 | | DC | LX | VI
- 7 | | DCC | LXX | VII
- 8 | | DCCC | LXXX | VIII
- 9 | | CM | XC | IX
- So, for example, 2345 would be represented as the concatenation of `MM` (for 2000), `CCC` (for 300), `XL` (for 40), and `V` (for 5), or `MMCCCXLV`.
- This defines a unique correct representation for each number from 1 to 3999 (4000 and above are not representable without breaking the first two rules). The 50 substrings method described above will identify each of these 3999 strings as valid, and all other strings of Roman numerals as invalid.
- ## Input
- - A string containing only Roman numerals (`I`, `V`, `X`, `L`, `C`, `D`, `M`).
- - The string will have length at least 1.
- - The string will have length at most 15 (this is the length of the longest valid string of Roman numerals).
- - You may choose to take input in lower case instead, but your output must match the case of your input.
- ## Output
- - If the input is a valid Roman number, output a consistent value indicating this.
- - Consistent means that the value must be the same for all valid Roman numbers.
- - The output for a valid Roman number must not be one of the strings from the list of 50.
- - If the input is not a valid Roman number, output exactly 1 string from the list of 50.
- - The output in this case must be a substring of the input.
- - If the input has 2 or more of the strings from the list of 50 as substrings, you may choose any 1 of them to be the output, but you must choose only 1 of them (you must not output 2 or more).
- ## Examples
- ### A valid Roman number
- The input `MCMXCVI` is the unique correct representation of 1996. It contains none of the 50 strings.
- ### A string that is not a valid Roman number
- Although the input `MMXDIII` might be suspected of representing 2493, it is not the unique correct representation of this number (which is `MMCDXCIII`). Note that it has `XD` as a substring, identifying it as invalid. The only correct output is therefore `XD`.
- ### An invalid string with more than 1 potential output
- The input `MMCCMDXXV` has 2 substrings that make it invalid, so either `CCM` or `CMD` would be correct outputs. It would not be correct to output both of these, or to output their overlap `CCMD`, as this is not one of the 50 strings.
- ## Test cases
- Test cases are in the format `INPUT : VALID, OUTPUTS`. Note that only one of the valid outputs can be chosen - outputting 2 or more is incorrect.
- The output "VALID" is just an example - for an input that is a valid Roman number you may choose to output any consistent value distinct from the 50 strings.
- ### Upper case test cases
- These reflect the case used in the rest of the challenge wording, although there is no requirement to use upper case for this challenge.
- ```text
- I : VALID
- V : VALID
- X : VALID
- L : VALID
- C : VALID
- D : VALID
- M : VALID
- II : VALID
- VV : VV
- XX : VALID
- LL : LL
- CC : VALID
- DD : DD
- MM : VALID
- III : VALID
- VII : VALID
- IVI : IVI
- IIV : IIV
- CCI : VALID
- CCV : VALID
- CCX : VALID
- CCL : VALID
- CCC : VALID
- CCD : CCD
- CCM : CCM
- IIII : IIII
- MLDI : LD
- MXXC : XXC
- DCIIX : IIX
- MCXXXX : XXXX
- MCCCCXVI : CCCC
- MMLXCVII : LXC
- MMMCMXCIX : VALID
- MMMDCCCLXXXVIII : VALID
- MMCCCXLV : VALID
- MCMXCVI : VALID
- MMXDIII : XD
- MMCDXCIII : VALID
- MMCCMDXXV : CCM, CMD
- XXX : VALID
- CLLX : LL
- DXXDMMV : DM, XD
- CCDDDIMDD : DD, IM, CCD
- VLCXIVXMCVXLC : VX, VL, LC, XM
- DVLIILVCXVXVMLI : VX, VL, IL, VM, VC
- VVDLMIVILXXDX : VV, VD, IL, XD, LM, IVI
- DMXXCMILVCMLLMV : DM, IL, LL, VC, LM, XXC, XCM
- CMXDVLCCDDLXLXC : DD, VL, LC, XD, XLX, LXC, CCD, LXL
- XDIXCLLMVVLCMCM : VV, VL, LC, XD, LL, LM, IXC, CMC, XCL
- IIXXCDVVLMILVDD : DD, VV, VL, VD, IL, LM, XXC, IXX, IIX, XCD
- DDMIIXXCMCCDCMM : DD, DM, CMC, CMM, XXC, IXX, DCM, CDC, XCM, CCD, IIX
- ```
- ### Lower case test cases
- These are the same test cases in lower case, in case you can benefit from taking lower case input.
- Note that if you take lower case input then you must also give lower case output.
- ```text
- i : valid
- v : valid
- x : valid
- l : valid
- c : valid
- d : valid
- m : valid
- ii : valid
- vv : vv
- xx : valid
- ll : ll
- cc : valid
- dd : dd
- mm : valid
- iii : valid
- vii : valid
- ivi : ivi
- iiv : iiv
- cci : valid
- ccv : valid
- ccx : valid
- ccl : valid
- ccc : valid
- ccd : ccd
- ccm : ccm
- iiii : iiii
- mldi : ld
- mxxc : xxc
- dciix : iix
- mcxxxx : xxxx
- mccccxvi : cccc
- mmlxcvii : lxc
- mmmcmxcix : valid
- mmmdccclxxxviii : valid
- mmcccxlv : valid
- mcmxcvi : valid
- mmxdiii : xd
- mmcdxciii : valid
- mmccmdxxv : ccm, cmd
- xxx : valid
- cllx : ll
- dxxdmmv : dm, xd
- ccdddimdd : dd, im, ccd
- vlcxivxmcvxlc : vx, vl, lc, xm
- dvliilvcxvxvmli : vx, vl, il, vm, vc
- vvdlmivilxxdx : vv, vd, il, xd, lm, ivi
- dmxxcmilvcmllmv : dm, il, ll, vc, lm, xxc, xcm
- cmxdvlccddlxlxc : dd, vl, lc, xd, xlx, lxc, ccd, lxl
- xdixcllmvvlcmcm : vv, vl, lc, xd, ll, lm, ixc, cmc, xcl
- iixxcdvvlmilvdd : dd, vv, vl, vd, il, lm, xxc, ixx, iix, xcd
- ddmiixxcmccdcmm : dd, dm, cmc, cmm, xxc, ixx, dcm, cdc, xcm, ccd, iix
- ```
- ## Scoring
- This is a [code golf challenge]. Your score is the number of bytes in your code. Lowest score for each language wins.
- > Explanations are optional, but I'm more likely to upvote answers that have one.
- [code golf challenge]: https://codegolf.codidact.com/categories/49/tags/4274 "The code-golf tag"
- [^1]: This is a common modern set of rules, described as [Standard form](https://en.wikipedia.org/wiki/Roman_numerals#Standard_form) on Wikipedia. It does not reflect all usages during history, but will be the basis of this challenge, since otherwise the 50 substrings approach does not work.
- Given a string of Roman numerals, decide whether it forms a valid Roman number. If not, output the substring that proves this, from the list of 50 strings described below.
- ## Relevant fact
- This challenge is based around the following fact:
- > A string of Roman numerals is a valid Roman number if and only if it contains none of the following 50 strings as a substring:
- >
- > ```text
- > "CCCC", "CCD", "CCM", "CDC", "CMC", "CMD", "CMM", "DCD", "DCM", "DD", "DM", "IC", "ID", "IIII", "IIV", "IIX", "IL", "IM", "IVI", "IXC", "IXI", "IXL", "IXV", "IXX", "LC", "LD", "LL", "LM", "LXC", "LXL", "MMMM", "VC", "VD", "VIV", "VIX", "VL", "VM", "VV", "VX", "XCC", "XCD", "XCL", "XCM", "XCX", "XD", "XLX", "XM", "XXC", "XXL", "XXXX"
- > ```
- This works for any length string, if a valid Roman number is defined as follows:
- ## Valid Roman numbers[^1]
- - Each numeral appears no more than 3 times consecutively.
- - Each of `V` (5), `L` (50), and `D` (500) appears no more than once consecutively.
- - A Roman number is constructed by concatenating the strings representing its ***thousands***, ***hundreds***, ***tens***, and ***units*** components.
- Decimal | Thousands | Hundreds | Tens | Units
- ------- | --------- | -------- | ---- | -
- 1 | M | C | X | I
- 2 | MM | CC | XX | II
- 3 | MMM | CCC | XXX | III
- 4 | | CD | XL | IV
- 5 | | D | L | V
- 6 | | DC | LX | VI
- 7 | | DCC | LXX | VII
- 8 | | DCCC | LXXX | VIII
- 9 | | CM | XC | IX
- So, for example, 2345 would be represented as the concatenation of `MM` (for 2000), `CCC` (for 300), `XL` (for 40), and `V` (for 5), or `MMCCCXLV`.
- This defines a unique correct representation for each number from 1 to 3999 (4000 and above are not representable without breaking the first two rules). The 50 substrings method described above will identify each of these 3999 strings as valid, and all other strings of Roman numerals as invalid.
- ## Input
- - A string containing only Roman numerals (`I`, `V`, `X`, `L`, `C`, `D`, `M`).
- - The string will have length at least 1.
- - The string will have length at most 15 (this is the length of the longest valid string of Roman numerals).
- - You may choose to take input in lower case instead, provided that you also use lower case in your output for an input that is not a valid Roman number.
- ## Output
- - If the input is a valid Roman number, output a consistent value indicating this.
- - Consistent means that the value must be the same for all valid Roman numbers.
- - The output for a valid Roman number must not be one of the strings from the list of 50.
- - Since the consistent value can be anything (and specifically does not need to contain Roman numerals), there is no requirement for it to be upper or lower case.
- - If the input is not a valid Roman number, output exactly 1 string from the list of 50.
- - The output in this case must be a substring of the input.
- - If the input has 2 or more of the strings from the list of 50 as substrings, you may choose any 1 of them to be the output, but you must choose only 1 of them (you must not output 2 or more).
- - If you chose to take input in lower case, then this output string must also be in lower case.
- ## Examples
- ### A valid Roman number
- The input `MCMXCVI` is the unique correct representation of 1996. It contains none of the 50 strings.
- ### A string that is not a valid Roman number
- Although the input `MMXDIII` might be suspected of representing 2493, it is not the unique correct representation of this number (which is `MMCDXCIII`). Note that it has `XD` as a substring, identifying it as invalid. The only correct output is therefore `XD`.
- ### An invalid string with more than 1 potential output
- The input `MMCCMDXXV` has 2 substrings that make it invalid, so either `CCM` or `CMD` would be correct outputs. It would not be correct to output both of these, or to output their overlap `CCMD`, as this is not one of the 50 strings.
- ## Test cases
- Test cases are in the format `INPUT : VALID, OUTPUTS`. Note that only one of the valid outputs can be chosen - outputting 2 or more is incorrect.
- The output "VALID" is just an example - for an input that is a valid Roman number you may choose to output any consistent value distinct from the 50 strings.
- ### Upper case test cases
- These reflect the case used in the rest of the challenge wording, although there is no requirement to use upper case for this challenge.
- ```text
- I : VALID
- V : VALID
- X : VALID
- L : VALID
- C : VALID
- D : VALID
- M : VALID
- II : VALID
- VV : VV
- XX : VALID
- LL : LL
- CC : VALID
- DD : DD
- MM : VALID
- III : VALID
- VII : VALID
- IVI : IVI
- IIV : IIV
- CCI : VALID
- CCV : VALID
- CCX : VALID
- CCL : VALID
- CCC : VALID
- CCD : CCD
- CCM : CCM
- IIII : IIII
- MLDI : LD
- MXXC : XXC
- DCIIX : IIX
- MCXXXX : XXXX
- MCCCCXVI : CCCC
- MMLXCVII : LXC
- MMMCMXCIX : VALID
- MMMDCCCLXXXVIII : VALID
- MMCCCXLV : VALID
- MCMXCVI : VALID
- MMXDIII : XD
- MMCDXCIII : VALID
- MMCCMDXXV : CCM, CMD
- XXX : VALID
- CLLX : LL
- DXXDMMV : DM, XD
- CCDDDIMDD : DD, IM, CCD
- VLCXIVXMCVXLC : VX, VL, LC, XM
- DVLIILVCXVXVMLI : VX, VL, IL, VM, VC
- VVDLMIVILXXDX : VV, VD, IL, XD, LM, IVI
- DMXXCMILVCMLLMV : DM, IL, LL, VC, LM, XXC, XCM
- CMXDVLCCDDLXLXC : DD, VL, LC, XD, XLX, LXC, CCD, LXL
- XDIXCLLMVVLCMCM : VV, VL, LC, XD, LL, LM, IXC, CMC, XCL
- IIXXCDVVLMILVDD : DD, VV, VL, VD, IL, LM, XXC, IXX, IIX, XCD
- DDMIIXXCMCCDCMM : DD, DM, CMC, CMM, XXC, IXX, DCM, CDC, XCM, CCD, IIX
- ```
- ### Lower case test cases
- These are the same test cases in lower case, in case you can benefit from taking lower case input.
- Note that if you take lower case input then you must also give lower case output.
- ```text
- i : valid
- v : valid
- x : valid
- l : valid
- c : valid
- d : valid
- m : valid
- ii : valid
- vv : vv
- xx : valid
- ll : ll
- cc : valid
- dd : dd
- mm : valid
- iii : valid
- vii : valid
- ivi : ivi
- iiv : iiv
- cci : valid
- ccv : valid
- ccx : valid
- ccl : valid
- ccc : valid
- ccd : ccd
- ccm : ccm
- iiii : iiii
- mldi : ld
- mxxc : xxc
- dciix : iix
- mcxxxx : xxxx
- mccccxvi : cccc
- mmlxcvii : lxc
- mmmcmxcix : valid
- mmmdccclxxxviii : valid
- mmcccxlv : valid
- mcmxcvi : valid
- mmxdiii : xd
- mmcdxciii : valid
- mmccmdxxv : ccm, cmd
- xxx : valid
- cllx : ll
- dxxdmmv : dm, xd
- ccdddimdd : dd, im, ccd
- vlcxivxmcvxlc : vx, vl, lc, xm
- dvliilvcxvxvmli : vx, vl, il, vm, vc
- vvdlmivilxxdx : vv, vd, il, xd, lm, ivi
- dmxxcmilvcmllmv : dm, il, ll, vc, lm, xxc, xcm
- cmxdvlccddlxlxc : dd, vl, lc, xd, xlx, lxc, ccd, lxl
- xdixcllmvvlcmcm : vv, vl, lc, xd, ll, lm, ixc, cmc, xcl
- iixxcdvvlmilvdd : dd, vv, vl, vd, il, lm, xxc, ixx, iix, xcd
- ddmiixxcmccdcmm : dd, dm, cmc, cmm, xxc, ixx, dcm, cdc, xcm, ccd, iix
- ```
- ## Scoring
- This is a [code golf challenge]. Your score is the number of bytes in your code. Lowest score for each language wins.
- > Explanations are optional, but I'm more likely to upvote answers that have one.
- [code golf challenge]: https://codegolf.codidact.com/categories/49/tags/4274 "The code-golf tag"
- [^1]: This is a common modern set of rules, described as [Standard form](https://en.wikipedia.org/wiki/Roman_numerals#Standard_form) on Wikipedia. It does not reflect all usages during history, but will be the basis of this challenge, since otherwise the 50 substrings approach does not work.
#3: Post edited
- Given a string of Roman numerals, decide whether it forms a valid Roman number. If not, output the substring that proves this, from the list of 50 strings described below.
- ## Relevant fact
- This challenge is based around the following fact:
- > A string of Roman numerals is a valid Roman number if and only if it contains none of the following 50 strings as a substring:
- >
- > ```text
- > "CCCC", "CCD", "CCM", "CDC", "CMC", "CMD", "CMM", "DCD", "DCM", "DD", "DM", "IC", "ID", "IIII", "IIV", "IIX", "IL", "IM", "IVI", "IXC", "IXI", "IXL", "IXV", "IXX", "LC", "LD", "LL", "LM", "LXC", "LXL", "MMMM", "VC", "VD", "VIV", "VIX", "VL", "VM", "VV", "VX", "XCC", "XCD", "XCL", "XCM", "XCX", "XD", "XLX", "XM", "XXC", "XXL", "XXXX"
- > ```
- This works for any length string, if a valid Roman number is defined as follows:
- ## Valid Roman numbers[^1]
- - Each numeral appears no more than 3 times consecutively.
- - Each of `V` (5), `L` (50), and `D` (500) appears no more than once consecutively.
- - A Roman number is constructed by concatenating the strings representing its ***thousands***, ***hundreds***, ***tens***, and ***units*** components.
- Decimal | Thousands | Hundreds | Tens | Units
- ------- | --------- | -------- | ---- | -
- 1 | M | C | X | I
- 2 | MM | CC | XX | II
- 3 | MMM | CCC | XXX | III
- 4 | | CD | XL | IV
- 5 | | D | L | V
- 6 | | DC | LX | VI
- 7 | | DCC | LXX | VII
- 8 | | DCCC | LXXX | VIII
- 9 | | CM | XC | IX
- So, for example, 2345 would be represented as the concatenation of `MM` (for 2000), `CCC` (for 300), `XL` (for 40), and `V` (for 5), or `MMCCCXLV`.
- This defines a unique correct representation for each number from 1 to 3999 (4000 and above are not representable without breaking the first two rules). The 50 substrings method described above will identify each of these 3999 strings as valid, and all other strings of Roman numerals as invalid.
- ## Input
- - A string containing only Roman numerals (`I`, `V`, `X`, `L`, `C`, `D`, `M`).
- - The string will have length at least 1.
- - The string will have length at most 15 (this is the length of the longest valid string of Roman numerals).
- ## Output
- - If the input is a valid Roman number, output a consistent value indicating this.
- - Consistent means that the value must be the same for all valid Roman numbers.
- - The output for a valid Roman number must not be one of the strings from the list of 50.
- - If the input is not a valid Roman number, output exactly 1 string from the list of 50.
- - The output in this case must be a substring of the input.
- - If the input has 2 or more of the strings from the list of 50 as substrings, you may choose any 1 of them to be the output, but you must choose only 1 of them (you must not output 2 or more).
- ## Examples
- ### A valid Roman number
- The input `MCMXCVI` is the unique correct representation of 1996. It contains none of the 50 strings.
- ### A string that is not a valid Roman number
- Although the input `MMXDIII` might be suspected of representing 2493, it is not the unique correct representation of this number (which is `MMCDXCIII`). Note that it has `XD` as a substring, identifying it as invalid. The only correct output is therefore `XD`.
- ### An invalid string with more than 1 potential output
- The input `MMCCMDXXV` has 2 substrings that make it invalid, so either `CCM` or `CMD` would be correct outputs. It would not be correct to output both of these, or to output their overlap `CCMD`, as this is not one of the 50 strings.
- ## Test cases
- Test cases are in the format `INPUT : VALID, OUTPUTS`. Note that only one of the valid outputs can be chosen - outputting 2 or more is incorrect.
- The output "VALID" is just an example - for an input that is a valid Roman number you may choose to output any consistent value distinct from the 50 strings.
- ```text
- I : VALID
- V : VALID
- X : VALID
- L : VALID
- C : VALID
- D : VALID
- M : VALID
- II : VALID
- VV : VV
- XX : VALID
- LL : LL
- CC : VALID
- DD : DD
- MM : VALID
- III : VALID
- VII : VALID
- IVI : IVI
- IIV : IIV
- CCI : VALID
- CCV : VALID
- CCX : VALID
- CCL : VALID
- CCC : VALID
- CCD : CCD
- CCM : CCM
- IIII : IIII
- MLDI : LD
- MXXC : XXC
- DCIIX : IIX
- MCXXXX : XXXX
- MCCCCXVI : CCCC
- MMLXCVII : LXC
- MMMCMXCIX : VALID
- MMMDCCCLXXXVIII : VALID
- MMCCCXLV : VALID
- MCMXCVI : VALID
- MMXDIII : XD
- MMCDXCIII : VALID
- MMCCMDXXV : CCM, CMD
- XXX : VALID
- CLLX : LL
- DXXDMMV : DM, XD
- CCDDDIMDD : DD, IM, CCD
- VLCXIVXMCVXLC : VX, VL, LC, XM
- DVLIILVCXVXVMLI : VX, VL, IL, VM, VC
- VVDLMIVILXXDX : VV, VD, IL, XD, LM, IVI
- DMXXCMILVCMLLMV : DM, IL, LL, VC, LM, XXC, XCM
- CMXDVLCCDDLXLXC : DD, VL, LC, XD, XLX, LXC, CCD, LXL
- XDIXCLLMVVLCMCM : VV, VL, LC, XD, LL, LM, IXC, CMC, XCL
- IIXXCDVVLMILVDD : DD, VV, VL, VD, IL, LM, XXC, IXX, IIX, XCD
- DDMIIXXCMCCDCMM : DD, DM, CMC, CMM, XXC, IXX, DCM, CDC, XCM, CCD, IIX
- ```
- ## Scoring
- This is a [code golf challenge]. Your score is the number of bytes in your code. Lowest score for each language wins.
- > Explanations are optional, but I'm more likely to upvote answers that have one.
- [code golf challenge]: https://codegolf.codidact.com/categories/49/tags/4274 "The code-golf tag"
- [^1]: This is a common modern set of rules, described as [Standard form](https://en.wikipedia.org/wiki/Roman_numerals#Standard_form) on Wikipedia. It does not reflect all usages during history, but will be the basis of this challenge, since otherwise the 50 substrings approach does not work.
- Given a string of Roman numerals, decide whether it forms a valid Roman number. If not, output the substring that proves this, from the list of 50 strings described below.
- ## Relevant fact
- This challenge is based around the following fact:
- > A string of Roman numerals is a valid Roman number if and only if it contains none of the following 50 strings as a substring:
- >
- > ```text
- > "CCCC", "CCD", "CCM", "CDC", "CMC", "CMD", "CMM", "DCD", "DCM", "DD", "DM", "IC", "ID", "IIII", "IIV", "IIX", "IL", "IM", "IVI", "IXC", "IXI", "IXL", "IXV", "IXX", "LC", "LD", "LL", "LM", "LXC", "LXL", "MMMM", "VC", "VD", "VIV", "VIX", "VL", "VM", "VV", "VX", "XCC", "XCD", "XCL", "XCM", "XCX", "XD", "XLX", "XM", "XXC", "XXL", "XXXX"
- > ```
- This works for any length string, if a valid Roman number is defined as follows:
- ## Valid Roman numbers[^1]
- - Each numeral appears no more than 3 times consecutively.
- - Each of `V` (5), `L` (50), and `D` (500) appears no more than once consecutively.
- - A Roman number is constructed by concatenating the strings representing its ***thousands***, ***hundreds***, ***tens***, and ***units*** components.
- Decimal | Thousands | Hundreds | Tens | Units
- ------- | --------- | -------- | ---- | -
- 1 | M | C | X | I
- 2 | MM | CC | XX | II
- 3 | MMM | CCC | XXX | III
- 4 | | CD | XL | IV
- 5 | | D | L | V
- 6 | | DC | LX | VI
- 7 | | DCC | LXX | VII
- 8 | | DCCC | LXXX | VIII
- 9 | | CM | XC | IX
- So, for example, 2345 would be represented as the concatenation of `MM` (for 2000), `CCC` (for 300), `XL` (for 40), and `V` (for 5), or `MMCCCXLV`.
- This defines a unique correct representation for each number from 1 to 3999 (4000 and above are not representable without breaking the first two rules). The 50 substrings method described above will identify each of these 3999 strings as valid, and all other strings of Roman numerals as invalid.
- ## Input
- - A string containing only Roman numerals (`I`, `V`, `X`, `L`, `C`, `D`, `M`).
- - The string will have length at least 1.
- - The string will have length at most 15 (this is the length of the longest valid string of Roman numerals).
- - You may choose to take input in lower case instead, but your output must match the case of your input.
- ## Output
- - If the input is a valid Roman number, output a consistent value indicating this.
- - Consistent means that the value must be the same for all valid Roman numbers.
- - The output for a valid Roman number must not be one of the strings from the list of 50.
- - If the input is not a valid Roman number, output exactly 1 string from the list of 50.
- - The output in this case must be a substring of the input.
- - If the input has 2 or more of the strings from the list of 50 as substrings, you may choose any 1 of them to be the output, but you must choose only 1 of them (you must not output 2 or more).
- ## Examples
- ### A valid Roman number
- The input `MCMXCVI` is the unique correct representation of 1996. It contains none of the 50 strings.
- ### A string that is not a valid Roman number
- Although the input `MMXDIII` might be suspected of representing 2493, it is not the unique correct representation of this number (which is `MMCDXCIII`). Note that it has `XD` as a substring, identifying it as invalid. The only correct output is therefore `XD`.
- ### An invalid string with more than 1 potential output
- The input `MMCCMDXXV` has 2 substrings that make it invalid, so either `CCM` or `CMD` would be correct outputs. It would not be correct to output both of these, or to output their overlap `CCMD`, as this is not one of the 50 strings.
- ## Test cases
- Test cases are in the format `INPUT : VALID, OUTPUTS`. Note that only one of the valid outputs can be chosen - outputting 2 or more is incorrect.
- The output "VALID" is just an example - for an input that is a valid Roman number you may choose to output any consistent value distinct from the 50 strings.
- ### Upper case test cases
- These reflect the case used in the rest of the challenge wording, although there is no requirement to use upper case for this challenge.
- ```text
- I : VALID
- V : VALID
- X : VALID
- L : VALID
- C : VALID
- D : VALID
- M : VALID
- II : VALID
- VV : VV
- XX : VALID
- LL : LL
- CC : VALID
- DD : DD
- MM : VALID
- III : VALID
- VII : VALID
- IVI : IVI
- IIV : IIV
- CCI : VALID
- CCV : VALID
- CCX : VALID
- CCL : VALID
- CCC : VALID
- CCD : CCD
- CCM : CCM
- IIII : IIII
- MLDI : LD
- MXXC : XXC
- DCIIX : IIX
- MCXXXX : XXXX
- MCCCCXVI : CCCC
- MMLXCVII : LXC
- MMMCMXCIX : VALID
- MMMDCCCLXXXVIII : VALID
- MMCCCXLV : VALID
- MCMXCVI : VALID
- MMXDIII : XD
- MMCDXCIII : VALID
- MMCCMDXXV : CCM, CMD
- XXX : VALID
- CLLX : LL
- DXXDMMV : DM, XD
- CCDDDIMDD : DD, IM, CCD
- VLCXIVXMCVXLC : VX, VL, LC, XM
- DVLIILVCXVXVMLI : VX, VL, IL, VM, VC
- VVDLMIVILXXDX : VV, VD, IL, XD, LM, IVI
- DMXXCMILVCMLLMV : DM, IL, LL, VC, LM, XXC, XCM
- CMXDVLCCDDLXLXC : DD, VL, LC, XD, XLX, LXC, CCD, LXL
- XDIXCLLMVVLCMCM : VV, VL, LC, XD, LL, LM, IXC, CMC, XCL
- IIXXCDVVLMILVDD : DD, VV, VL, VD, IL, LM, XXC, IXX, IIX, XCD
- DDMIIXXCMCCDCMM : DD, DM, CMC, CMM, XXC, IXX, DCM, CDC, XCM, CCD, IIX
- ```
- ### Lower case test cases
- These are the same test cases in lower case, in case you can benefit from taking lower case input.
- Note that if you take lower case input then you must also give lower case output.
- ```text
- i : valid
- v : valid
- x : valid
- l : valid
- c : valid
- d : valid
- m : valid
- ii : valid
- vv : vv
- xx : valid
- ll : ll
- cc : valid
- dd : dd
- mm : valid
- iii : valid
- vii : valid
- ivi : ivi
- iiv : iiv
- cci : valid
- ccv : valid
- ccx : valid
- ccl : valid
- ccc : valid
- ccd : ccd
- ccm : ccm
- iiii : iiii
- mldi : ld
- mxxc : xxc
- dciix : iix
- mcxxxx : xxxx
- mccccxvi : cccc
- mmlxcvii : lxc
- mmmcmxcix : valid
- mmmdccclxxxviii : valid
- mmcccxlv : valid
- mcmxcvi : valid
- mmxdiii : xd
- mmcdxciii : valid
- mmccmdxxv : ccm, cmd
- xxx : valid
- cllx : ll
- dxxdmmv : dm, xd
- ccdddimdd : dd, im, ccd
- vlcxivxmcvxlc : vx, vl, lc, xm
- dvliilvcxvxvmli : vx, vl, il, vm, vc
- vvdlmivilxxdx : vv, vd, il, xd, lm, ivi
- dmxxcmilvcmllmv : dm, il, ll, vc, lm, xxc, xcm
- cmxdvlccddlxlxc : dd, vl, lc, xd, xlx, lxc, ccd, lxl
- xdixcllmvvlcmcm : vv, vl, lc, xd, ll, lm, ixc, cmc, xcl
- iixxcdvvlmilvdd : dd, vv, vl, vd, il, lm, xxc, ixx, iix, xcd
- ddmiixxcmccdcmm : dd, dm, cmc, cmm, xxc, ixx, dcm, cdc, xcm, ccd, iix
- ```
- ## Scoring
- This is a [code golf challenge]. Your score is the number of bytes in your code. Lowest score for each language wins.
- > Explanations are optional, but I'm more likely to upvote answers that have one.
- [code golf challenge]: https://codegolf.codidact.com/categories/49/tags/4274 "The code-golf tag"
- [^1]: This is a common modern set of rules, described as [Standard form](https://en.wikipedia.org/wiki/Roman_numerals#Standard_form) on Wikipedia. It does not reflect all usages during history, but will be the basis of this challenge, since otherwise the 50 substrings approach does not work.
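
The table-based construction in the "Valid Roman numbers" section of this revision can be illustrated with a short, ungolfed Python sketch. It is not part of the challenge: the names `THOUSANDS`, `HUNDREDS`, `TENS`, `UNITS` and `to_roman` are hypothetical, and each list simply transcribes one column of the table, with an empty string added for a zero digit.

```python
# Each list is one column of the table; index 0 (a zero digit) contributes nothing.
THOUSANDS = ["", "M", "MM", "MMM"]
HUNDREDS = ["", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"]
TENS = ["", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"]
UNITS = ["", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]


def to_roman(n: int) -> str:
    """Concatenate the thousands, hundreds, tens and units components of n."""
    if not 1 <= n <= 3999:
        raise ValueError("only 1 to 3999 are representable under these rules")
    return (
        THOUSANDS[n // 1000]
        + HUNDREDS[n // 100 % 10]
        + TENS[n // 10 % 10]
        + UNITS[n % 10]
    )


print(to_roman(2345))  # the worked example from the text: MMCCCXLV
```

Enumerating `to_roman(n)` for every `n` from 1 to 3999 yields the full set of valid strings that the challenge refers to.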
#2: Post edited
- Given a string of Roman numerals, decide whether it forms a valid Roman number. If not, output the substring that proves this, from the list of 50 strings described below.
- ## Relevant fact
- This challenge is based around the following fact:
- > A string of Roman numerals is a valid Roman number if and only if it contains none of the following 50 strings as a substring:
- >
- > ```text
- > "CCCC", "CCD", "CCM", "CDC", "CMC", "CMD", "CMM", "DCD", "DCM", "DD", "DM", "IC", "ID", "IIII", "IIV", "IIX", "IL", "IM", "IVI", "IXC", "IXI", "IXL", "IXV", "IXX", "LC", "LD", "LL", "LM", "LXC", "LXL", "MMMM", "VC", "VD", "VIV", "VIX", "VL", "VM", "VV", "VX", "XCC", "XCD", "XCL", "XCM", "XCX", "XD", "XLX", "XM", "XXC", "XXL", "XXXX"
- > ```
- This works for any length string, if a valid Roman number is defined as follows:
- ## Valid Roman numbers[^1]
- - Each numeral appears no more than 3 times consecutively.
- - Each of `V` (5), `L` (50), and `D` (500) appears no more than once consecutively.
- - A Roman number is constructed by concatenating the strings representing its ***thousands***, ***hundreds***, ***tens***, and ***units*** components.
- Decimal | Thousands | Hundreds | Tens | Units
- ------- | --------- | -------- | ---- | -
- 1 | M | C | X | I
- 2 | MM | CC | XX | II
- 3 | MMM | CCC | XXX | III
- 4 | | CD | XL | IV
- 5 | | D | L | V
- 6 | | DC | LX | VI
- 7 | | DCC | LXX | VII
- 8 | | DCCC | LXXX | VIII
- 9 | | CM | XC | IX
- So, for example, 2345 would be represented as the concatenation of `MM` (for 2000), `CCC` (for 300), `XL` (for 40), and `V` (for 5), or `MMCCCXLV`.
- This defines a unique correct representation for each number from 1 to 3999 (4000 and above are not representable without breaking the first two rules). The 50 substrings method described above will identify each of these 3999 strings as valid, and all other strings of Roman numerals as invalid.
- ## Input
- - A string containing only Roman numerals (`I`, `V`, `X`, `L`, `C`, `D`, `M`).
- - The string will have length at least 1.
- - The string will have length at most 15 (this is the length of the longest valid string of Roman numerals).
- ## Output
- - If the input is a valid Roman number, output a consistent value indicating this.
- - Consistent means that the value must be the same for all valid Roman numbers.
- - The output for a valid Roman number must not be one of the strings from the list of 50.
- - If the input is not a valid Roman number, output exactly 1 string from the list of 50.
- - The output in this case must be a substring of the input.
- - If the input has 2 or more of the strings from the list of 50 as substrings, you may choose any 1 of them to be the output, but you must choose only 1 of them (you must not output 2 or more).
- ## Examples
- ### A valid Roman number
- The input `MCMXCVI` is the unique correct representation of 1996. It contains none of the 50 strings.
- ### A string that is not a valid Roman number
- Although the input `MMXDIII` might be suspected of representing 2493, it is not the unique correct representation of this number (which is `MMCDXCIII`). Note that it has `XD` as a substring, identifying it as invalid. The only correct output is therefore `XD`.
- ### An invalid string with more than 1 potential output
- The input `MMCCMDXXV` has 2 substrings that make it invalid, so either `CCM` or `CMD` would be correct outputs. It would not be correct to output both of these, or to output their overlap `CCMD`, as this is not one of the 50 strings.
- ## Test cases
- Test cases are in the format `INPUT : VALID, OUTPUTS`. Note that only one of the valid outputs can be chosen - outputting 2 or more is incorrect.
- ```text
- I : VALID
- V : VALID
- X : VALID
- L : VALID
- C : VALID
- D : VALID
- M : VALID
- II : VALID
- VV : VV
- XX : VALID
- LL : LL
- CC : VALID
- DD : DD
- MM : VALID
- III : VALID
- VII : VALID
- IVI : IVI
- IIV : IIV
- CCI : VALID
- CCV : VALID
- CCX : VALID
- CCL : VALID
- CCC : VALID
- CCD : CCD
- CCM : CCM
- IIII : IIII
- MLDI : LD
- MXXC : XXC
- DCIIX : IIX
- MCXXXX : XXXX
- MCCCCXVI : CCCC
- MMLXCVII : LXC
- MMMCMXCIX : VALID
- MMMDCCCLXXXVIII : VALID
- MMCCCXLV : VALID
- MCMXCVI : VALID
- MMXDIII : XD
- MMCDXCIII : VALID
- MMCCMDXXV : CCM, CMD
- XXX : VALID
- CLLX : LL
- DXXDMMV : DM, XD
- CCDDDIMDD : DD, IM, CCD
- VLCXIVXMCVXLC : VX, VL, LC, XM
- DVLIILVCXVXVMLI : VX, VL, IL, VM, VC
- VVDLMIVILXXDX : VV, VD, IL, XD, LM, IVI
- DMXXCMILVCMLLMV : DM, IL, LL, VC, LM, XXC, XCM
- CMXDVLCCDDLXLXC : DD, VL, LC, XD, XLX, LXC, CCD, LXL
- XDIXCLLMVVLCMCM : VV, VL, LC, XD, LL, LM, IXC, CMC, XCL
- IIXXCDVVLMILVDD : DD, VV, VL, VD, IL, LM, XXC, IXX, IIX, XCD
- DDMIIXXCMCCDCMM : DD, DM, CMC, CMM, XXC, IXX, DCM, CDC, XCM, CCD, IIX
- ```
- ## Scoring
- This is a [code golf challenge]. Your score is the number of bytes in your code. Lowest score for each language wins.
- > Explanations are optional, but I'm more likely to upvote answers that have one.
- [code golf challenge]: https://codegolf.codidact.com/categories/49/tags/4274 "The code-golf tag"
- [^1]: This is a common modern set of rules, described as [Standard form](https://en.wikipedia.org/wiki/Roman_numerals#Standard_form) on Wikipedia. It does not reflect all usages during history, but will be the basis of this challenge, since otherwise the 50 substrings approach does not work.
- Given a string of Roman numerals, decide whether it forms a valid Roman number. If not, output the substring that proves this, from the list of 50 strings described below.
- ## Relevant fact
- This challenge is based around the following fact:
- > A string of Roman numerals is a valid Roman number if and only if it contains none of the following 50 strings as a substring:
- >
- > ```text
- > "CCCC", "CCD", "CCM", "CDC", "CMC", "CMD", "CMM", "DCD", "DCM", "DD", "DM", "IC", "ID", "IIII", "IIV", "IIX", "IL", "IM", "IVI", "IXC", "IXI", "IXL", "IXV", "IXX", "LC", "LD", "LL", "LM", "LXC", "LXL", "MMMM", "VC", "VD", "VIV", "VIX", "VL", "VM", "VV", "VX", "XCC", "XCD", "XCL", "XCM", "XCX", "XD", "XLX", "XM", "XXC", "XXL", "XXXX"
- > ```
- This works for any length string, if a valid Roman number is defined as follows:
- ## Valid Roman numbers[^1]
- - Each numeral appears no more than 3 times consecutively.
- - Each of `V` (5), `L` (50), and `D` (500) appears no more than once consecutively.
- - A Roman number is constructed by concatenating the strings representing its ***thousands***, ***hundreds***, ***tens***, and ***units*** components.
- Decimal | Thousands | Hundreds | Tens | Units
- ------- | --------- | -------- | ---- | -
- 1 | M | C | X | I
- 2 | MM | CC | XX | II
- 3 | MMM | CCC | XXX | III
- 4 | | CD | XL | IV
- 5 | | D | L | V
- 6 | | DC | LX | VI
- 7 | | DCC | LXX | VII
- 8 | | DCCC | LXXX | VIII
- 9 | | CM | XC | IX
- So, for example, 2345 would be represented as the concatenation of `MM` (for 2000), `CCC` (for 300), `XL` (for 40), and `V` (for 5), or `MMCCCXLV`.
- This defines a unique correct representation for each number from 1 to 3999 (4000 and above are not representable without breaking the first two rules). The 50 substrings method described above will identify each of these 3999 strings as valid, and all other strings of Roman numerals as invalid.
- ## Input
- - A string containing only Roman numerals (`I`, `V`, `X`, `L`, `C`, `D`, `M`).
- - The string will have length at least 1.
- - The string will have length at most 15 (this is the length of the longest valid string of Roman numerals).
- ## Output
- - If the input is a valid Roman number, output a consistent value indicating this.
- - Consistent means that the value must be the same for all valid Roman numbers.
- - The output for a valid Roman number must not be one of the strings from the list of 50.
- - If the input is not a valid Roman number, output exactly 1 string from the list of 50.
- - The output in this case must be a substring of the input.
- - If the input has 2 or more of the strings from the list of 50 as substrings, you may choose any 1 of them to be the output, but you must choose only 1 of them (you must not output 2 or more).
- ## Examples
- ### A valid Roman number
- The input `MCMXCVI` is the unique correct representation of 1996. It contains none of the 50 strings.
- ### A string that is not a valid Roman number
- Although the input `MMXDIII` might be suspected of representing 2493, it is not the unique correct representation of this number (which is `MMCDXCIII`). Note that it has `XD` as a substring, identifying it as invalid. The only correct output is therefore `XD`.
- ### An invalid string with more than 1 potential output
- The input `MMCCMDXXV` has 2 substrings that make it invalid, so either `CCM` or `CMD` would be correct outputs. It would not be correct to output both of these, or to output their overlap `CCMD`, as this is not one of the 50 strings.
- ## Test cases
- Test cases are in the format `INPUT : VALID, OUTPUTS`. Note that only one of the valid outputs can be chosen - outputting 2 or more is incorrect.
- The output "VALID" is just an example - for an input that is a valid Roman number you may choose to output any consistent value distinct from the 50 strings.
- ```text
- I : VALID
- V : VALID
- X : VALID
- L : VALID
- C : VALID
- D : VALID
- M : VALID
- II : VALID
- VV : VV
- XX : VALID
- LL : LL
- CC : VALID
- DD : DD
- MM : VALID
- III : VALID
- VII : VALID
- IVI : IVI
- IIV : IIV
- CCI : VALID
- CCV : VALID
- CCX : VALID
- CCL : VALID
- CCC : VALID
- CCD : CCD
- CCM : CCM
- IIII : IIII
- MLDI : LD
- MXXC : XXC
- DCIIX : IIX
- MCXXXX : XXXX
- MCCCCXVI : CCCC
- MMLXCVII : LXC
- MMMCMXCIX : VALID
- MMMDCCCLXXXVIII : VALID
- MMCCCXLV : VALID
- MCMXCVI : VALID
- MMXDIII : XD
- MMCDXCIII : VALID
- MMCCMDXXV : CCM, CMD
- XXX : VALID
- CLLX : LL
- DXXDMMV : DM, XD
- CCDDDIMDD : DD, IM, CCD
- VLCXIVXMCVXLC : VX, VL, LC, XM
- DVLIILVCXVXVMLI : VX, VL, IL, VM, VC
- VVDLMIVILXXDX : VV, VD, IL, XD, LM, IVI
- DMXXCMILVCMLLMV : DM, IL, LL, VC, LM, XXC, XCM
- CMXDVLCCDDLXLXC : DD, VL, LC, XD, XLX, LXC, CCD, LXL
- XDIXCLLMVVLCMCM : VV, VL, LC, XD, LL, LM, IXC, CMC, XCL
- IIXXCDVVLMILVDD : DD, VV, VL, VD, IL, LM, XXC, IXX, IIX, XCD
- DDMIIXXCMCCDCMM : DD, DM, CMC, CMM, XXC, IXX, DCM, CDC, XCM, CCD, IIX
- ```
- ## Scoring
- This is a [code golf challenge]. Your score is the number of bytes in your code. Lowest score for each language wins.
- > Explanations are optional, but I'm more likely to upvote answers that have one.
- [code golf challenge]: https://codegolf.codidact.com/categories/49/tags/4274 "The code-golf tag"
- [^1]: This is a common modern set of rules, described as [Standard form](https://en.wikipedia.org/wiki/Roman_numerals#Standard_form) on Wikipedia. It does not reflect all usages during history, but will be the basis of this challenge, since otherwise the 50 substrings approach does not work.
#1: Initial revision
The 50 substrings that validate any string of Roman numerals
Given a string of Roman numerals, decide whether it forms a valid Roman number. If not, output the substring that proves this, from the list of 50 strings described below.

## Relevant fact

This challenge is based around the following fact:

> A string of Roman numerals is a valid Roman number if and only if it contains none of the following 50 strings as a substring:
>
> ```text
> "CCCC", "CCD", "CCM", "CDC", "CMC", "CMD", "CMM", "DCD", "DCM", "DD", "DM", "IC", "ID", "IIII", "IIV", "IIX", "IL", "IM", "IVI", "IXC", "IXI", "IXL", "IXV", "IXX", "LC", "LD", "LL", "LM", "LXC", "LXL", "MMMM", "VC", "VD", "VIV", "VIX", "VL", "VM", "VV", "VX", "XCC", "XCD", "XCL", "XCM", "XCX", "XD", "XLX", "XM", "XXC", "XXL", "XXXX"
> ```

This works for any length string, if a valid Roman number is defined as follows:

## Valid Roman numbers[^1]

- Each numeral appears no more than 3 times consecutively.
- Each of `V` (5), `L` (50), and `D` (500) appears no more than once consecutively.
- A Roman number is constructed by concatenating the strings representing its ***thousands***, ***hundreds***, ***tens***, and ***units*** components.

Decimal | Thousands | Hundreds | Tens | Units
------- | --------- | -------- | ---- | -
1 | M | C | X | I
2 | MM | CC | XX | II
3 | MMM | CCC | XXX | III
4 | | CD | XL | IV
5 | | D | L | V
6 | | DC | LX | VI
7 | | DCC | LXX | VII
8 | | DCCC | LXXX | VIII
9 | | CM | XC | IX

So, for example, 2345 would be represented as the concatenation of `MM` (for 2000), `CCC` (for 300), `XL` (for 40), and `V` (for 5), or `MMCCCXLV`.

This defines a unique correct representation for each number from 1 to 3999 (4000 and above are not representable without breaking the first two rules). The 50 substrings method described above will identify each of these 3999 strings as valid, and all other strings of Roman numerals as invalid.

## Input

- A string containing only Roman numerals (`I`, `V`, `X`, `L`, `C`, `D`, `M`).
- The string will have length at least 1.
- The string will have length at most 15 (this is the length of the longest valid string of Roman numerals).

## Output

- If the input is a valid Roman number, output a consistent value indicating this.
- Consistent means that the value must be the same for all valid Roman numbers.
- The output for a valid Roman number must not be one of the strings from the list of 50.
- If the input is not a valid Roman number, output exactly 1 string from the list of 50.
- The output in this case must be a substring of the input.
- If the input has 2 or more of the strings from the list of 50 as substrings, you may choose any 1 of them to be the output, but you must choose only 1 of them (you must not output 2 or more).

## Examples

### A valid Roman number

The input `MCMXCVI` is the unique correct representation of 1996. It contains none of the 50 strings.

### A string that is not a valid Roman number

Although the input `MMXDIII` might be suspected of representing 2493, it is not the unique correct representation of this number (which is `MMCDXCIII`). Note that it has `XD` as a substring, identifying it as invalid. The only correct output is therefore `XD`.

### An invalid string with more than 1 potential output

The input `MMCCMDXXV` has 2 substrings that make it invalid, so either `CCM` or `CMD` would be correct outputs. It would not be correct to output both of these, or to output their overlap `CCMD`, as this is not one of the 50 strings.

## Test cases

Test cases are in the format `INPUT : VALID, OUTPUTS`. Note that only one of the valid outputs can be chosen - outputting 2 or more is incorrect.

```text
I : VALID
V : VALID
X : VALID
L : VALID
C : VALID
D : VALID
M : VALID
II : VALID
VV : VV
XX : VALID
LL : LL
CC : VALID
DD : DD
MM : VALID
III : VALID
VII : VALID
IVI : IVI
IIV : IIV
CCI : VALID
CCV : VALID
CCX : VALID
CCL : VALID
CCC : VALID
CCD : CCD
CCM : CCM
IIII : IIII
MLDI : LD
MXXC : XXC
DCIIX : IIX
MCXXXX : XXXX
MCCCCXVI : CCCC
MMLXCVII : LXC
MMMCMXCIX : VALID
MMMDCCCLXXXVIII : VALID
MMCCCXLV : VALID
MCMXCVI : VALID
MMXDIII : XD
MMCDXCIII : VALID
MMCCMDXXV : CCM, CMD
XXX : VALID
CLLX : LL
DXXDMMV : DM, XD
CCDDDIMDD : DD, IM, CCD
VLCXIVXMCVXLC : VX, VL, LC, XM
DVLIILVCXVXVMLI : VX, VL, IL, VM, VC
VVDLMIVILXXDX : VV, VD, IL, XD, LM, IVI
DMXXCMILVCMLLMV : DM, IL, LL, VC, LM, XXC, XCM
CMXDVLCCDDLXLXC : DD, VL, LC, XD, XLX, LXC, CCD, LXL
XDIXCLLMVVLCMCM : VV, VL, LC, XD, LL, LM, IXC, CMC, XCL
IIXXCDVVLMILVDD : DD, VV, VL, VD, IL, LM, XXC, IXX, IIX, XCD
DDMIIXXCMCCDCMM : DD, DM, CMC, CMM, XXC, IXX, DCM, CDC, XCM, CCD, IIX
```

## Scoring

This is a [code golf challenge]. Your score is the number of bytes in your code. Lowest score for each language wins.

> Explanations are optional, but I'm more likely to upvote answers that have one.

[code golf challenge]: https://codegolf.codidact.com/categories/49/tags/4274 "The code-golf tag"

[^1]: This is a common modern set of rules, described as [Standard form](https://en.wikipedia.org/wiki/Roman_numerals#Standard_form) on Wikipedia. It does not reflect all usages during history, but will be the basis of this challenge, since otherwise the 50 substrings approach does not work.
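
For readers who want an ungolfed reference point, the input and output rules of the challenge can be sketched in Python as follows. This is only an illustration, not a competitive answer: the names `FORBIDDEN` and `first_forbidden` are hypothetical, and returning `None` as the "consistent value" for valid input is one arbitrary choice among many that the rules allow.

```python
from typing import Optional

# The 50 strings from the challenge statement, in the order listed there.
FORBIDDEN = [
    "CCCC", "CCD", "CCM", "CDC", "CMC", "CMD", "CMM", "DCD", "DCM", "DD",
    "DM", "IC", "ID", "IIII", "IIV", "IIX", "IL", "IM", "IVI", "IXC",
    "IXI", "IXL", "IXV", "IXX", "LC", "LD", "LL", "LM", "LXC", "LXL",
    "MMMM", "VC", "VD", "VIV", "VIX", "VL", "VM", "VV", "VX", "XCC",
    "XCD", "XCL", "XCM", "XCX", "XD", "XLX", "XM", "XXC", "XXL", "XXXX",
]


def first_forbidden(s: str) -> Optional[str]:
    """Return one forbidden substring of s, or None if s contains none.

    When several of the 50 strings occur in s, the one that comes first
    in FORBIDDEN is returned; the challenge accepts any single one.
    """
    for bad in FORBIDDEN:
        if bad in s:
            return bad
    return None


print(first_forbidden("MCMXCVI"))  # valid input, so no forbidden substring
print(first_forbidden("MMXDIII"))  # contains XD
```

Because the function returns on the first match, it never outputs more than one of the 50 strings, which is the requirement for inputs such as `MMCCMDXXV` that contain both `CCM` and `CMD`.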