Life is full of Serendipity.
(In Korean: Life is the joy of finding, one by one, the precious and valuable things that God prepared for me long, long ago.)
-- Kyuseok Shim (In the year of 1999)--
Kyuseok Shim
School of Electrical Engineering and Computer Science
Seoul National University
Kwanak P.O. Box 34
SEOUL 151-742, KOREA
Office (SNU): Building 302 - Room 531
Telephone: 82-2-880-7269 (Office), 82-2-880-1741 (Secretary), 82-2-880-1758 (KDD Lab)
Fax: 82-2-871-5974
Email: shim[AT]ee[DOT]snu[DOT]ac[DOT]kr

Information for Prospective Students
(Information for those interested in graduate school)
Knowledge Discovery and Database Research Laboratory (Data Mining and Database Laboratory) Home Page
Do your best and God will do the rest.

I am a Professor in the School of Electrical Engineering and Computer Science at Seoul National University, Korea, where I also lead the Knowledge Discovery and Database Research Laboratory. Before that, I was an Assistant Professor in the Computer Science Department of KAIST (Korea), a member of technical staff (MTS) at Bell Laboratories (Murray Hill), and a research staff member at IBM Almaden Research Center (San Jose). At Bell Laboratories, I was one of the key contributors to the Serendip data mining project, and at IBM Almaden Research Center I was a member of the Quest Data Mining project. I have visited the Data Management, Exploration & Mining group at Microsoft Research as a visiting scientist during summer and winter breaks. I also worked with Surajit Chaudhuri as a summer intern for two summers at Hewlett-Packard Laboratories.

I received a Ph.D. in Computer Science from the University of Maryland at College Park in 1993, under the supervision of Professor Timos Sellis. I received a B.S. degree in Electrical Engineering from Seoul National University in 1986 and an M.S. degree in Computer Science from the University of Maryland at College Park in 1988. My graduate study was supported by the University of Maryland at College Park, the Korean Government Overseas Full Merit Scholarship, and Hewlett-Packard Laboratories.

I have been working in the areas of data mining, data privacy, embedded flash memory database systems, semi-structured data (XML), stream data, histograms, query processing, query optimization and data warehousing.

Publication Citations
Recent Professional Activities
- 2015 IEEE International Conference on Data Engineering (ICDE'15): Program Committee Co-Chair (Research Track)
- 2014 World Wide Web Conference: Program Committee Co-Chair (Research Track)
- 2014 VLDB Conference: Track Chair (also Research Track Associate Editor for PVLDB Vol 7)
- 2014 APWeb Conference: Tutorial Chair
- 2013 ACM SIGKDD Conference: Program Committee Member
- 2013 World Wide Web Conference: Track Chair (Web Mining Track)
- 2013 ICDT Conference: Program Committee Member
- 2013 ACM SIGMOD Conference: Demonstration Program Committee Member
- 2013 ACM CIKM Conference (CIKM'13): Senior Program Committee Member (Databases Track)
- 2013 IEEE International Conference on Data Mining (ICDM'13): Vice Chair
- 2013 IEEE International Conference on Big Data (IEEE BigData 2013): Program Committee Member
- 2013 PAKDD Conference: Senior Program Committee Member
- 2013 WAIM Conference: Workshop Co-Chair & Program Committee Member
- 2013 MDM Conference: Program Committee Member
- 2013 ASONAM Conference: Program Committee Member
- 2012 ACM SIGKDD Conference: Program Committee Member
- 2012 ACM SIGMOD Conference: Program Committee Member
- 2012 IEEE International Conference on Data Mining (ICDM'12): Program Committee Member
- 2012 IEEE International Conference on Data Engineering (ICDE'12): Program Committee Member
- 2012 VLDB Conference, Istanbul, Turkey: Program Committee Member
- 2012 ACM CIKM Conference (CIKM'12): Program Committee Member
- 2012 PAKDD Conference: Senior Program Committee Member
- 2012 DASFAA Conference, Pusan, Korea: Panel Chair
- 2011 VLDB Conference: Program Committee Member
- 2011 IEEE International Conference on Data Mining (ICDM'11): Program Committee Member
- 2011 IEEE International Conference on Data Engineering (ICDE'11): Program Committee Member (Industrial Track)
- 2011 SIAM International Conference on Data Mining (SDM'11): Program Committee Member (Industrial Track)
- 2011 ACM CIKM Conference (CIKM'11): Program Committee Member (Knowledge Management Track)
- 2011 International Conference on Emerging Databases (EDB'11): Program Committee Co-Chair
- 2011 International Conference on Advanced Data Mining and Applications (ADMA'11): Vice PC Chair
- 2010 IEEE International Conference on Data Mining (ICDM'10): Program Committee Vice-Chair
- 2010 VLDB Conference: Program Committee Member
- 2010 International Conference on Extending Database Technology (EDBT'10): Program Committee Member
- 2010 World Wide Web Conference: Area Chair (Data Mining and Machine Learning Track)
- 2009 ACM SIGKDD Conference: Program Committee Member
- 2009 ACM SIGMOD Conference: Program Committee Member
- 2009 World Wide Web Conference: Program Committee Member (Data Mining Track)
- 2009 ICDT Conference: Program Committee Member
- 2009 IEEE International Conference on Data Engineering (ICDE'09): Vice-Chair (Mining Data and Knowledge Discovery)
- 2008 IEEE International Conference on Data Mining (ICDM'08): Program Committee Vice-Chair
- 2008 ACM SIGKDD Conference: Program Committee Member
- 2008 VLDB Conference: Program Committee Member
- 2008 World Wide Web Conference (WWW'08): Program Committee Member (Data Mining Track)
- 2008 ACM SIGMOD Conference: Program Committee Member
- 2008 IEEE International Conference on Data Engineering (ICDE'08): Program Committee Member (Mining Data, Text, and the Web)
- 2008 IEEE International Conference on Cooperative Information Systems (CoopIS'08): Program Committee Member
- 2007 World Wide Web Conference (WWW'07): Deputy Chair (Data Mining)
- 2007 International Conference on Scientific and Statistical Database (SSDBM'07): Program Committee Member
- 2007 ACM SIGMOD Conference: Program Committee Member
- 2007 ACM SIGKDD Conference: Senior Program Committee Member
- 2007 IEEE International Conference on Data Engineering (ICDE'07): Vice-Chair (Mining Data, Text, and the Web)
- 2006 ACM SIGMOD Conference: Program Committee Member
- 2006 IEEE International Conference on Data Engineering (ICDE'06): Program Committee Member
- 2006 VLDB Conference: Tutorial Co-Chair & Program Committee Member
- 2006 International Conference on Extending Database Technology (EDBT'06): Program Committee Member
Awards and Honors
- Best Teacher Award (For teaching Introduction to Algorithm Class from Seoul National University), 2010
- Best Teacher Award (For teaching Data Structure and Algorithm Class from Seoul National University), 2004
- Best Teacher Award (For teaching Data Structure and Algorithm Class from Seoul National University), 2003
- Korean Government Overseas Full Merit Scholarship (For Ph.D. studies from Korean Government), 1990 -- 1993
- Magna Cum Laude (For B.E. Degree from Seoul National University), 1986
- Minister of Education Award (For Software Contest from Ministry of Science and Technology, Korea), 1985
- Minister of Trading Award (For Software Contest from Ministry of Science and Technology, Korea), 1985
Research Interests
- Data Mining and Knowledge Discovery
- XML and Semi-structured Data
- Internet Stream Data
- Histogram
- Query Processing and Optimization
- Data Warehousing and OLAP
- Data Privacy and Security
- Embedded DBMS
Work Experience
- KAIST, Taejon, Korea (Feb. 1999 -- Feb. 2002)
- Microsoft Research, Redmond, WA (Jan., Feb., July and Aug. of 2001, Jan., Feb., June, July and Aug. of 2002)
- Bell Laboratories, Murray Hill, NJ (March 1996 -- February 2000, June -- August 2000)
- IBM Almaden Research Center, San Jose, CA (November 1994 -- March 1996)
- Federal Reserve Board, Washington, DC (September 1993 -- November 1994)
- Hewlett-Packard Laboratories, Palo Alto, CA (Summers of 1992 and 1993)
Projects
Degrees
- Ph.D. in Computer Science, University of Maryland, College Park, 1993.
- M.S. in Computer Science, University of Maryland, College Park, 1988.
- B.E. (Ranked Top) in Electrical Engineering, Seoul National University, Seoul, Korea, 1986.

Tutorial Talks
- "Offline and Stream algorithms for efficient computation of synopsis structures", VLDB Conference, 2005
- "Analyzing and Mining Data Streams", PAKDD Conference, 2003
- "Storage and Retrieval of XML Data using Relational Databases", IEEE ICDE Conference, 2003
- "Storage and Retrieval of XML Data using Relational Databases", PAKDD Conference, 2002
- "Storage and Retrieval of XML data using Relational Databases", VLDB Conference, 2001
- "Recent Advances in Data Mining Algorithms for Large Databases", PAKDD Conference, 2001
- "Recent Advances in Data Mining Algorithms on Large Databases", CIKM Conference, 1999
- "Data Mining on Large Databases", IEEE ICDE Conference, 1999
- "Scalable Algorithms for Mining Databases", ACM SIGKDD Conference, 1999

Recent articles contributed to Microsoftware magazine
PUBLICATIONS
[List of publications from the DBLP Bibliography Server]
Data Privacy and Security
- "Approximate Algorithms with Generalizing Attribute Values for k-Anonymity"
(with Hyoungmin Park)
Information Systems Journal, Elsevier, 35(8): 2010
- "Approximate Algorithms for K-Anonymity"
(with Hyoungmin Park)
ACM SIGMOD International Conference on Management of Data, Beijing, China, 2007
Data Mining and Knowledge Discovery
- "CATCH: A detecting algorithm for coalition attacks of hit inflation in internet advertising"
(with Chulyun Kim and Hui Miao)
Information Systems Journal, Elsevier, 36(8): 2011
- "TEXT: Automatic Template Extraction from Heterogeneous Web Pages"
(with Chulyun Kim)
IEEE Transactions on Knowledge and Data Engineering Journal, 23(4): 2011
- "SQUIRE: Sequential Pattern Mining with Quantities"
(with Chulyun Kim, JongHwa Lim and Raymond T. Ng)
Journal of Systems and Software, Elsevier, 80(10): 2007
- "SQUIRE: Sequential Pattern Mining with Quantities"
(with Chulyun Kim, JongHwa Lim and Raymond T. Ng)
the 20th IEEE International Conference on Data Engineering, Boston, USA, 2004
- "WALRUS: A Similarity Retrieval Algorithm for Image Databases"
(with Apostol Natsev and Rajeev Rastogi)
IEEE Transactions on Knowledge and Data Engineering Journal 16(3): 2004
Query Processing and Optimization
- "Efficient Processing of Substring Match Queries with Inverted Variable-length Gram Indexes"
(with Younghoon Kim, Hyoungmin Park and Kyoung-Gu Woo)
To appear in Information Sciences Journal, Elsevier
- "Efficient Top-k Algorithms for Approximate Substring Matching"
(with Younghoon Kim)
ACM SIGMOD International Conference on Management of Data, New York, USA, 2013
- "Parallel Top-K Similarity Join Algorithms Using MapReduce"
(with Younghoon Kim)
the 28th IEEE International Conference on Data Engineering, Washington, D.C., USA, 2012
- "Similarity Join Size Estimation using Locality Sensitive Hashing"
(with Hongrae Lee and Raymond Ng)
the 37th International Conference on VLDB, Seattle, USA, 2011
- "Power-Law Based Estimation of Set Similarity Join Size"
(with Hongrae Lee and Raymond Ng)
the 35th International Conference on VLDB, Lyon, France, 2009
- "Approximate Substring Selectivity Estimation"
(with Hongrae Lee and Raymond Ng)
the 12th International Conference on Extending Database Technology (EDBT), March, 2009
- "Wavelet Synopsis for Hierarchical Range Queries with Workloads"
(with Sudipto Guha and Hyoungmin Park)
VLDB Journal 17(5), August 2008
- "Extending Q-Grams to Estimate Selectivity of String Matching with Edit Distance"
(with Hongrae Lee and Raymond Ng)
the 33rd International Conference on VLDB, Vienna, Austria, 2007
- "A Note on Linear Time Algorithms for Maximum Error Histograms"
(with Sudipto Guha)
IEEE Transactions on Knowledge and Data Engineering Journal 19:(7), 2007
- "Approximation and Streaming Algorithms for Histogram Construction Problems"
(with Sudipto Guha and Nick Koudas)
ACM Transactions on Database Systems 31:(1), March 2006
- "Storing XML (with XSD) in SQL Databases: Interplay of Logical and Physical designs"
(with Zhiyuan Chen, Surajit Chaudhuri and Yuqing Wu)
IEEE Transactions on Knowledge and Data Engineering Journal 17:(12), 2005
- "An Adaptive Path Index for XML Data using the Query Workload"
(with Chin-Wan Chung and Jun-Ki Min)
Information Systems Journal 30:(6) Elsevier, 2005
- "Storing XML (with XSD) in SQL Databases: Interplay of Logical and Physical designs"
(with Zhiyuan Chen, Surajit Chaudhuri and Yuqing Wu)
the 20th IEEE International Conference on Data Engineering, Boston, USA, 2004
- "XWAVE: Approximate Extended Wavelets for Streaming Data"
(with Sudipto Guha and Chulyun Kim)
the 30th International Conference on VLDB, Toronto, Ontario, Canada, 2004
- "REHIST: Relative Error Histogram Construction Algorithms"
(with Sudipto Guha and Jungchul Woo)
the 30th International Conference on VLDB, Toronto, Ontario, Canada, 2004
- "Optimizing Queries with Materialized Views"
(with Surajit Chaudhuri, Ravi Krishnamurthy and Spyros Potamianos)
Materialized Views (Techniques, Implementations, and Applications), Edited by Ashish Gupta and Inderpal Singh Mumick, The MIT Press, 1999
- "Parametric Query Optimization"
(with Yannis E. Ioannidis, Raymond T. Ng and Timos K. Sellis)
the 18th International Conference on VLDB, Vancouver, Canada, 1992