2025-05-13 - 2026-05-13
Overview
34 Issues closed from 1 user
Closed
#10 Store version numbers for the individual sources
Closed
#100 Don't store JMdict_Sense.language when building with translations from a single language
Closed
#101 Externalize JMdict_Sense.type into a separate table
Closed
#96 Use elementId for XREF__JMdict_KanjiElement__KANJIDIC_Character
Closed
#94 Don't store kanji + reading in xref tables
Closed
#65 Omit rows with basescore 0 in JMdict_EntryScore
Closed
#67 Create separate tables for KANJIDIC_Character's grade/frequency/jlpt
Closed
#85 Keep local copies of all datasources
Closed
#95 Use elementId to refer to sense restrictions
Closed
#98 Make kanji and reading elements unique from each other by embedding type
Closed
#97 Make elementId into a composite of entryId and the ordering number
Closed
#35 Word search: support globs
Closed
#60 Create deconjugation/lemmatization algorithm
Closed
#76 Generate "match spans", detailing where the search results matched the searchword
Closed
#71 Always order exact matches first in word search, no matter commonness
Closed
#70 Remove duplicates from word search join
Closed
#21 Store KANJIDIC onyomi as hiragana
Closed
#41 Create query cli tool for querying a single JMdict entryId
Closed
#64 Reduce size of type enum by reducing to char(1) (r, k)
Closed
#52 Create integer ids for Reading/Kanji elements to reduce space usage
Closed
#57 Filter inputs for nix source
Closed
#61 Export function for creating an empty database
Closed
#55 Order english queries by score
Closed
#48 Add function to filter kanji from a string by what's available in the dictionary
Closed
#44 Precalculate search scores with a table and a bunch of triggers
Closed
#50 List words by JLPT
Closed
#49 List kanji by JLPT
Closed
#29 Create radical search which retrieves remaining possible combinations
Closed
#47 Add orderNum to KANJIDIC kunyomi, onyomi and meaning
Closed
#42 Renormalize KANJIDIC radical data
Closed
#39 Word search: pagination
Closed
#31 Add JLPT tags to WordSearchResult
Closed
#32 Extend language source data for WordSearchSense (upon search)
Closed
#30 Add basic info (reading, kanji) to word search xrefs (antonyms, seealsos)
77 Issues created by 1 user
Opened
#30 Add basic info (reading, kanji) to word search xrefs (antonyms, seealsos)
Opened
#31 Add JLPT tags to WordSearchResult
Opened
#32 Extend language source data for WordSearchSense (upon search)
Opened
#33 Ensure consistent naming scheme for tables
Opened
#34 Group kanji readings/meanings by rmgroup parent nodes
Opened
#35 Word search: support globs
Opened
#36 Word search: support a variety of tags
Opened
#37 Word search: support mixed input (kanji, kana, romaji)
Opened
#38 Add wanikani levels
Opened
#39 Word search: pagination
Opened
#40 Word search: optimize word regrouping
Opened
#41 Create query cli tool for querying a single JMdict entryId
Opened
#42 Renormalize KANJIDIC radical data
Opened
#43 Add additional radical data from KanjiAlive
Opened
#44 Precalculate search scores with a table and a bunch of triggers
Opened
#45 Add nix package for dart doc
Opened
#46 Create "furigana segmentation" (or "kanji/kana alignment") algorithm
Opened
#47 Add orderNum to KANJIDIC kunyomi, onyomi and meaning
Opened
#48 Add function to filter kanji from a string by what's available in the dictionary
Opened
#49 List kanji by JLPT
Opened
#50 List words by JLPT
Opened
#51 Measure time taken during substeps of data ingestion
Opened
#52 Create integer ids for Reading/Kanji elements to reduce space usage
Opened
#53 Find a better way to manage/order migrations
Opened
#54 Split out Kanji/Reading element's readingDoesNotMatchKanji/news/ichi/spec/gai/nf into separate tables
Opened
#55 Order english queries by score
Opened
#56 Validate input before using it in FTS5 queries
Opened
#57 Filter inputs for nix source
Opened
#58 Add an option to only ingest a subset of data for development speed
Opened
#59 Add developer option to log EXPLAIN QUERY output
Opened
#60 Create deconjugation/lemmatization algorithm
Opened
#61 Export function for creating an empty database
Opened
#62 Add pairs of lookalike kanjis to use for word search
Opened
#63 Consider embedding vibrato morph analyzer
Opened
#64 Reduce size of type enum by reducing to char(1) (r, k)
Opened
#65 Omit rows with basescore 0 in JMdict_EntryScore
Opened
#66 Use INTEGER for JMdict_JLPTTag.jlptLevel
Opened
#67 Create separate tables for KANJIDIC_Character's grade/frequency/jlpt
Opened
#68 Use integer enum for KANJIDIC_Codepoint.type
Opened
#69 Figure out what's up with the scoring here
Opened
#70 Remove duplicates from word search join
Opened
#71 Always order exact matches first in word search, no matter commonness
Opened
#72 Improve ambiguous crossreferences by choosing those that share kanji first
Opened
#73 Add kanji variant usage percentage
Opened
#74 Create performance benchmarks
Opened
#75 Add audio samples
Opened
#76 Generate "match spans", detailing where the search results matched the searchword
Opened
#77 Generate diagram for the lemmatization transducer
Opened
#78 Add common followup verbs to lemmatizer
Opened
#79 Connect lemmatizer to word search
Opened
#80 Cross reference dictionary to aid lemmatizer
Opened
#81 Add 〜た/〜だ verb followups for lemmatizer
Opened
#82 Add common volitional verb followups
Opened
#83 Cache lemmatization results
Opened
#84 Reenable test concurrency
Opened
#85 Keep local copies of all datasources
Opened
#86 Add "kanji example word" mode to word search
Opened
#87 Add kanjivg data to sqlite db
Opened
#88 Add 漢検 level ratings to kanji
Opened
#89 Add wikipedia references for certain dictionary entries
Opened
#90 Support wildcards in word search
Opened
#91 Test json de/serialization roundtrip for all models
Opened
#92 Find datasource for idioms and fixed phrases
Opened
#93 Vendor custom SQLite build with ICU extension enabled
Opened
#94 Don't store kanji + reading in xref tables
Opened
#95 Use elementId to refer to sense restrictions
Opened
#96 Use elementId for XREF__JMdict_KanjiElement__KANJIDIC_Character
Opened
#97 Make elementId into a composite of entryId and the ordering number
Opened
#98 Make kanji and reading elements unique from each other by embedding type
Opened
#99 Disable constraints causing sqlite autoindex creation when compiling for production usage
Opened
#100 Don't store JMdict_Sense.language when building with translations from a single language
Opened
#101 Externalize JMdict_Sense.type into a separate table
Opened
#102 Retrieve JMdict_SenseGlossaryType information
Opened
#103 Get rid of JMdict_SenseGlossary_byPhrase index for production use
Opened
#104 Add 放送禁止用語 tags
Opened
#105 Register commit hash for 'datasources' repo
Opened
#106 Switch to the XML-NG version of JMdict (once it releases)
18 Unresolved Conversations
Open
#6 Create tool for diffing two instances of the database
Open
#23 Word search: kana type independence
Open
#13 Word search: automatically deconjugate words
Open
#11 Add ENAMDICT/JMnedict
Open
#16 Find source for word pitch data
Open
#18 Generate conjugation tables
Open
#22 Kanji search: query for example words
Open
#25 Add system for normalizing kanji variants
Open
#5 Add tatoeba sentences
Open
#27 Deal with remaining characters from kana -> romaji transliteration
Open
#28 Duplicate kana on ヽ, ヾ, ゝ, ゞ during K -> R transliteration
Open
#26 Add source of grammar data
Open
#20 Normalize numbers and other symbols during search
Open
#19 Word search: add levenshtein thresholds
Open
#17 Add kanji freq data from https://scriptin.github.io/kanji-frequency/
Open
#12 Add stroke count data from radkfile
Open
#9 Add progress bars while creating the database
Open
#7 Build with CI