[Fast Campus] Data Analysis Bootcamp Cohort 16, Week 3 - Python Data Types

Numeric types (numbers)

Characteristic: no quotation marks needed

int: integer

1, -2, 3

a = 9
type(a)
# int

float: real number (floating-point number)

1.5, -2.3, 3.4

a = 9.99
type(a)
# float

Arithmetic operators

+ : addition

- : subtraction

* : multiplication

/ : division

** : exponentiation (power)

// : floor division (quotient)

% : modulo (remainder)

x = 3
y = 5

print(x ** y) # x raised to the power y (x^y)
# 243
print(x // y) # quotient of x divided by y
# 0
print(x % y)  # remainder of x divided by y
# 3
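
The three division-related operators are easy to mix up, so here is a minimal sketch comparing them. The values 7 and 2 are chosen only for illustration, and `divmod()` is a built-in that was not covered above.

```python
x = 7
y = 2

true_division = x / y          # "/" always returns a float
floor_division = x // y        # "//" rounds down to the nearest integer
remainder = x % y              # "%" gives what is left over
quotient, rest = divmod(x, y)  # built-in returning (x // y, x % y)
negative_floor = -7 // 2       # rounds toward negative infinity, not toward zero

print(true_division)   # 3.5
print(floor_division)  # 3
print(remainder)       # 1
print(quotient, rest)  # 3 1
print(negative_floor)  # -4
print(type(4 / 2))     # <class 'float'>
```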

Reference - https://queirozf.com/entries/python-number-formatting-examples

String type (strings)

Characteristic: quotation marks are required (double or single quotes both work)

str: string

"Hello World!", 'Welcome'

a = 'hymdro'
type(a)
# str

An apostrophe inside a single-quoted string must be escaped with a backslash

print('It\'s Cool')
# It's Cool

print("""Joseph \"Joe\" Biden""")
# Joseph "Joe" Biden

Triple quotes allow a string to span multiple lines

"""a
b
c"""

'''a
b
c'''

# "Failure is simply the opportunity to begin again." he says."
# Express this sentence in three ways.
# (1) """
# (2) ''
# (3) \

# (1)
print(""" "Failure is simply the opportunity to begin again." he says." """)

# (2)
print('"Failure is simply the opportunity to begin again." he says."')

# (3)
print("\"Failure is simply the opportunity to begin again.\" he says.\"")

String addition & multiplication

Strings (str) can be added to each other (a str and an int cannot be added)

a = 'H'
b = 'o'
c = 'm'
d = 'e'

a+b+c+d
# 'Home'
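
A short sketch of what happens when a str and an int are added, and one common way around it. The variable names `count` and `label` are made up for this example.

```python
count = 3
label = "apples"

# Adding a str and an int directly raises TypeError
try:
    message = label + count
except TypeError as error:
    error_name = type(error).__name__
    print(error_name)  # TypeError

# Converting the int with str() first makes the addition work
message = str(count) + " " + label
print(message)  # 3 apples
```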

A str can be multiplied by an int (the string repeats n times)

a = 'H'
b = 'o'
c = 'm'
d = 'e'

a*5 # repeat a five times
# 'HHHHH'

A concatenated str can also be multiplied by an int (a str cannot be multiplied by another str)

a = 'H'
b = 'o'
c = 'm'
d = 'e'

(a+b+c+d) * 5
# 'HomeHomeHomeHomeHome'

String indexing/slicing: extract only the part of a string you need

# syntax
"str"[start:stop:step]

"아버지가방에들어가신다"[0:9:3]
# '아가들'

"아버지가방에들어가신다"[-6:-1:2]
# '에어신'

a = "Python is the easiest computer language in the world"

print(a[22:39])
# computer language

print(a[-9:])
# the world
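
Each of the three slice parts can be left out. The sketch below, using a made-up `text` variable, shows the defaults and a negative step.

```python
text = "Python"

first_three = text[:3]      # start omitted: begins at index 0
last_three = text[-3:]      # stop omitted: runs to the end
every_other = text[::2]     # step of 2: every second character
reversed_text = text[::-1]  # negative step: walks backwards

print(first_three)    # Pyt
print(last_three)     # hon
print(every_other)    # Pto
print(reversed_text)  # nohtyP
```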

weight = '10kg'
int(weight[:2])
# 10

Reference - https://wikidocs.net/2838

[String formatting] f-strings method (formatted string literals)

Introduced in Python 3.6

Prefixing the opening quotation mark with f lets you embed parameters inside the string

Parameters go inside braces {}

The f prefix is case-insensitive (f or F)

# syntax
데이터1 = 123
데이터2 = "문자"
f"숫자는 {데이터1}, 문자는 {데이터2}."
F"숫자는 {데이터1}, 문자는 {데이터2}."

# example
나이 = 28
직업 = "학생"
print(f"나는 {나이}살이고 {직업}이다.")
# 나는 28살이고 학생이다.

Arithmetic operations can be used inside the braces

f"{2 * 37}"
# '74'

Function calls can be used inside the braces

def to_lowercase(input):
    return input.lower()

name = "Eric Idle"
f"{to_lowercase(name)} is funny."
# 'eric idle is funny.'

Objects created from a class can be used

class Comedian:
    def __init__(self, first_name, last_name, age):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age

    def __str__(self):
        return f"{self.first_name} {self.last_name} is {self.age}."

    def __repr__(self):
        return f"{self.first_name} {self.last_name} is {self.age}. Surprise!"

new_comedian = Comedian("Eric", "Idle", "74")

f"{new_comedian}"
# 'Eric Idle is 74.'

# using conversion flag !r
f"{new_comedian!r}"
# 'Eric Idle is 74. Surprise!'

Multiline strings are supported

Every line that contains a parameter needs its own f prefix

name = "Eric"
profession = "comedian"
affiliation = "Monty Python"
message = (
    f"Hi {name}. "
    f"You are a {profession}. "
    f"You were in {affiliation}."
    )

message
#'Hi Eric. You are a comedian. You were in Monty Python.'

Instead of wrapping the lines in parentheses (), you can end each line with a backslash \

name = "Eric"
profession = "comedian"
affiliation = "Monty Python"
message = f"Hi {name}. " \
          f"You are a {profession}. " \
          f"You were in {affiliation}."

message
#'Hi Eric. You are a comedian. You were in Monty Python.'
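
f-strings also accept a format specifier after a colon inside the braces, which ties back to number formatting. This is a minimal sketch with made-up values, not something covered in the notes above.

```python
price = 1234.5678
ratio = 0.256
name = "Eric"

two_decimals = f"{price:.2f}"  # fixed-point with 2 decimal places
with_commas = f"{price:,.2f}"  # thousands separator plus 2 decimals
as_percent = f"{ratio:.1%}"    # multiplies by 100 and appends %
padded = f"{name:>10}"         # right-aligned in a field 10 characters wide

print(two_decimals)  # 1234.57
print(with_commas)   # 1,234.57
print(as_percent)    # 25.6%
print(repr(padded))  # '      Eric'
```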

Reference - https://realpython.com/python-f-strings/

Reference - https://docs.python.org/3/reference/lexical_analysis.html#f-strings

[String formatting] %-formatting method

# syntax
print("%s%s" % (var1, var2))
"%s%s" % (var1, var2)

나이 = 28
직업 = "학생"
print("나는 %s살이고 %s이다." % (나이, 직업))
# 나는 28살이고 학생이다.

Drawback: as the number of variables grows, it becomes hard to tell which value fills which placeholder

first_name = "Eric"
last_name = "Idle"
age = 74
profession = "comedian"
affiliation = "Monty Python"
"Hello, %s %s. You are %s. You are a %s. You were a member of %s." % (first_name, last_name, age, profession, affiliation)
# 'Hello, Eric Idle. You are 74. You are a comedian. You were a member of Monty Python.'

[String formatting] `format()` method

# syntax
print("string{}string{}string".format(var1, var2))
"string{}string{}string".format(var1, var2)

나이 = 28
직업 = "학생"
print("나는 {}살이고 {}이다.".format(나이, 직업))
# 나는 28살이고 학생이다.

Putting an index inside the braces {} lets you change the order

# syntax
print("string{1}string{0}string".format(var1, var2))
"string{1}string{0}string".format(var1, var2)

print("나는 {1}살이고 {0}이다.".format(직업, 나이))
# 나는 28살이고 학생이다.

Values from a dictionary (a container type) can be output

# syntax
var = {'key1': 'string', 'key2': 0}
"string{parameter1}string{parameter2}string".format(parameter1=var['key1'], parameter2=var['key2'])

person = {'name': 'Eric', 'age': 74}
"Hello, {name}. You are {age}.".format(name=person['name'], age=person['age'])
# 'Hello, Eric. You are 74.'

# syntax using **
var = {'key1': 'str', 'key2': 0}
"string{key1}string{key2}string".format(**var)

person = {'name': 'Eric', 'age': 74}
"Hello, {name}. You are {age}.".format(**person)
# 'Hello, Eric. You are 74.'

Drawback: as the number of variables grows, the call becomes long and hard to follow

first_name = "Eric"
last_name = "Idle"
age = 74
profession = "comedian"
affiliation = "Monty Python"
print(("Hello, {first_name} {last_name}. You are {age}. "
       "You are a {profession}. You were a member of {affiliation}.")
      .format(first_name=first_name, last_name=last_name, age=age,
              profession=profession, affiliation=affiliation))
# Hello, Eric Idle. You are 74. You are a comedian. You were a member of Monty Python.

[String replacement] `replace()` method

# syntax
variableName.replace("FIND", "REPLACE")

korean = "가 나 다 라 마 바 사"

x = korean.replace(" ","-")
x

# '가-나-다-라-마-바-사'
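
Two details worth knowing about `replace()`: it returns a new string and leaves the original untouched, and an optional third argument limits how many replacements are made. A small sketch reusing the same sample string:

```python
korean = "가 나 다 라 마 바 사"

# The third argument caps the number of replacements (here, the first two spaces)
first_two = korean.replace(" ", "-", 2)

print(first_two)  # 가-나-다 라 마 바 사
print(korean)     # 가 나 다 라 마 바 사  (the original string is unchanged)
```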

[Counting string characters] `len()` function

a = 'world'
len(a)
# 5

Counting the number of values in a list

characters = ["사고뭉치 짱구", "시크한 철수", "화난유리", "코맹맹이 맹구", "쫄보 훈이"]
len(characters)
# 5

Counting the characters in each list value (spaces included)

characters = ["사고뭉치 짱구", "시크한 철수", "화난유리", "코맹맹이 맹구", "쫄보 훈이"]

for i in characters:
    print(len(i))

# 7
# 6
# 4
# 7
# 5

[Counting string characters] `count()` method

Passing an empty string "" counts the gaps before, between, and after the characters, so the result is the number of characters + 1

# syntax
string.count("substring", start, end)
# substring: required parameter (the text to count)
# start, end: optional parameters

a = 'world'
a.count("")
# 6

a = 'world'
a.count("",1)
# 5

Counting occurrences of a character in each list value

char = ["사고뭉치 짱구", "시크한 철수", "화난유리", "코맹맹이 맹구", "쫄보 훈이"]

# loop over the values directly instead of tracking an index by hand
for name in char:
    print(name.count("구"))

# 1
# 0
# 0
# 1
# 0
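
The optional start and end parameters restrict the search to part of the string. A minimal sketch with a made-up sentence:

```python
sentence = "banana bandana"

total = sentence.count("an")            # whole string
from_seven = sentence.count("an", 7)    # from index 7 to the end ("bandana")
first_six = sentence.count("an", 0, 6)  # index 0 up to (not including) 6 ("banana")

print(total)       # 4
print(from_seven)  # 2
print(first_six)   # 2
```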

Reference - https://www.programiz.com/python-programming/methods/string/count

Boolean type (boolean)

bool: Boolean
- One of two values: True or False
- Often used in conditional statements and loops as the result of comparison and logical operations

isTrue = True
if isTrue:
    print("It is true.")
else:
    print("It is false.")

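Comparison operators - inequality signs that ask a question
- x > y : True if x is greater than y
- x < y : True if x is less than y
- x >= y : True if x is greater than or equal to y
- x <= y : True if x is less than or equal to y
- x == y : True if x and y are equal
- x != y : True if x and y are different
- Reference - https://www.w3schools.com/python/gloss_python_comparison_operators.asp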

Logical operators - conjunctions

and: True only when both x and y are True (intersection)

True and False
# False

or: True when at least one of x and y is True (union)

True or False
# True

not: the opposite of x (complement)

not True
# False
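
Comparison and logical operators are usually combined inside a condition. The sketch below uses made-up variables and thresholds (`age`, `is_student`, the 65 cutoff) purely for illustration.

```python
age = 28
is_student = True

in_twenties = age >= 20 and age < 30     # both comparisons must be True
gets_discount = is_student or age >= 65  # one True side is enough
is_minor = not age >= 19                 # "not" flips the comparison result

if in_twenties and gets_discount:
    result = "discounted ticket"
else:
    result = "regular ticket"

print(in_twenties)    # True
print(gets_discount)  # True
print(is_minor)       # False
print(result)         # discounted ticket
```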

Container data types

Main operations on container data types: CRUD
- create : declare
- read : look up
- update : modify (append, insert, extend)
- delete : remove

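List type (list)

# create
listName = ["data0", "data1", "data2", "data3"] # items listed in order; like a variable, usually defined near the top of the code
listName = ['data0', 'data1', 'data2', 'data3'] # double and single quotes are interchangeable
listName # in an interactive session, the name alone displays the list without print()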
# ['data0', 'data1', 'data2', 'data3']

# read - indexing
listName[index] # syntax

blackPink = ["Jennie", "Jisu", "Rosé", "Lisa"]
blackPink[0]
# 'Jennie'

# update
listName[index] = "newData" # syntax

NewJeans = ["Minji", "Hanni", "Daniel", "Haerin", "Hyein"]
NewJeans[2] = "Danielle"
NewJeans
# ['Minji', 'Hanni', 'Danielle', 'Haerin', 'Hyein']

# delete
del listName[0] # syntax

thislist = ["apple", "banana", "cherry"]
del thislist[0]
print(thislist)
# ['banana', 'cherry']


# delete - without an index, the list itself is deleted
del listName # syntax

thisList = ["apple", "banana", "cherry"]
del thisList
thisList
# NameError: name 'thisList' is not defined

List slicing

x = ["a", "b", "c", "d", "e"]
x[0:3]
# ['a', 'b', 'c']

Indexing a list inside a list (nested list indexing)

x = ["a", "b", "c", ["d", "e"]]
x[3][1]
# 'e'

List update methods - append(), insert(), extend(), sort()

# append() method
listName.append("newData") # syntax

NewJeans = ["Minji", "Hanni", "Danielle", "Haerin"]
NewJeans.append("Hyein")
NewJeans
# ['Minji', 'Hanni', 'Danielle', 'Haerin', 'Hyein']


# insert() method
listName.insert(index, "newData") # syntax

NewJeans = ["Minji", "Hanni", "Haerin", "Hyein"]
NewJeans.insert(2, "Danielle")
NewJeans
# ['Minji', 'Hanni', 'Danielle', 'Haerin', 'Hyein']


# extend() method
firstList.extend(secondList) # syntax
# any iterable object (tuples, sets, dictionaries) can be passed to extend()

thislist = ["apple", "banana", "cherry"]
tropical = ["mango", "pineapple", "papaya"]
thislist.extend(tropical)
print(thislist)
# ['apple', 'banana', 'cherry', 'mango', 'pineapple', 'papaya']


# sort() method
numbers = [5, 1, 3, 9, 2]

numbers.sort()
print(numbers)
# [1, 2, 3, 5, 9]

numbers.sort(reverse=True)
print(numbers)
# [9, 5, 3, 2, 1]
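
`sort()` changes the list in place and returns None. When the original order must be kept, the built-in `sorted()` returns a new list instead, and both accept a `key` function. A short sketch, assuming a list of names like the ones used earlier:

```python
names = ["Minji", "Hanni", "Danielle", "Haerin", "Hyein"]

alphabetical = sorted(names)        # new list; names itself is not modified
by_length = sorted(names, key=len)  # sort by length; ties keep their original order
in_place_result = names.sort()      # sorts names in place and returns None

print(alphabetical)     # ['Danielle', 'Haerin', 'Hanni', 'Hyein', 'Minji']
print(by_length)        # ['Minji', 'Hanni', 'Hyein', 'Haerin', 'Danielle']
print(in_place_result)  # None
```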

Reference - https://www.w3schools.com/python/python_lists_add.asp

List delete methods - remove(), pop(), clear()

# remove() method
listName.remove("data0") # syntax

SNSD = ["윤아", "태연", "수영", "제시카", "서현", "효연", "티파니", "유리", "써니"]
SNSD.remove("제시카")
SNSD
# ['윤아', '태연', '수영', '서현', '효연', '티파니', '유리', '써니']


# pop() method
listName.pop(0) # syntax
listName.pop() # without a parameter, the last item is removed

thislist = ["apple", "banana", "cherry"]
thislist.pop(1)
print(thislist)
# ['apple', 'cherry']

thislist = ["apple", "banana", "cherry"]
thislist.pop()
print(thislist)
# ['apple', 'banana']


# clear() method
thislist.clear() # syntax
# only the items are deleted; the (now empty) list still exists

thislist = ["apple", "banana", "cherry"]
thislist.clear()
print(thislist)
# []
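
Two behaviors that the examples above do not show: `pop()` returns the item it removes, and `remove()` deletes only the first match and raises ValueError when the value is missing. A minimal sketch with a made-up `fruits` list:

```python
fruits = ["apple", "banana", "cherry", "banana"]

removed = fruits.pop(0)  # pop() returns the removed item
fruits.remove("banana")  # removes only the first "banana"

try:
    fruits.remove("mango")  # "mango" is not in the list
    missing = False
except ValueError:
    missing = True

print(removed)  # apple
print(fruits)   # ['cherry', 'banana']
print(missing)  # True
```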

Reference - https://www.w3schools.com/python/python_lists_remove.asp

Dictionary type (dict)

# create
dictName = {"key1": "value1", "key2": "value2"} # keys are defined like the entries of a dictionary
dictName # in an interactive session, the name alone displays the dict without print()
# {'key1': 'value1', 'key2': 'value2'}


# read
dictName["key"] # syntax
# unlike a list, a dict is hash-based, so values are looked up by key, not by position (key lookup is typically much faster than searching a list)

blackPink = {"rap1": "Jennie", "vocal1": "Jisu", "vocal2": "Rosé", "rap2": "Lisa"}
blackPink["rap1"]
# 'Jennie'


# update
dictName["existing key"] = "newData" # syntax, assign to an existing key

NewJeans = {"vocal1": "Minji", "vocal2": "Hanni", "vocal3": "Daniel", "vocal4": "Haerin", "vocal5": "Hyein"}
NewJeans["vocal3"] = "Danielle"
NewJeans
# {'vocal1': 'Minji',
#  'vocal2': 'Hanni',
#  'vocal3': 'Danielle',
#  'vocal4': 'Haerin',
#  'vocal5': 'Hyein'}


# add
dictName["new key"] = "newData" # syntax, assign to a new key

NewJeans = {"vocal1": "Minji", "vocal2": "Hanni", "vocal3": "Danielle", "vocal4": "Haerin"}
NewJeans["vocal5"] = "Hyein"
NewJeans
# {'vocal1': 'Minji',
#  'vocal2': 'Hanni',
#  'vocal3': 'Danielle',
#  'vocal4': 'Haerin',
#  'vocal5': 'Hyein'}


# delete
del dictName['key'] # syntax

thisDict = {'fruit1': "apple", 'fruit2': "banana", 'fruit3': "cherry"}
del thisDict['fruit1']
print(thisDict)
# {'fruit2': 'banana', 'fruit3': 'cherry'}


# delete dict
del dictName # syntax
# without a key, the dict itself is deleted

thisDict = {'fruit1': "apple", 'fruit2': "banana", 'fruit3': "cherry"}
del thisDict
thisDict
# NameError: name 'thisDict' is not defined
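
Looking up a key that does not exist raises KeyError, so it can help to check with `in` first. The sketch below also shows `dict.pop()`, which removes a key and returns its value; the `scores` dict is made up for this example.

```python
scores = {"Alice": 90, "Bob": 85}

has_alice = "Alice" in scores  # "in" checks the keys
has_david = "David" in scores

try:
    scores["David"]  # square-bracket lookup of a missing key
    lookup_failed = False
except KeyError:
    lookup_failed = True

bob_score = scores.pop("Bob")  # removes the key and returns its value

print(has_alice, has_david)  # True False
print(lookup_failed)         # True
print(bob_score)             # 85
print(scores)                # {'Alice': 90}
```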

Dictionary methods - get(), keys(), values(), items(), update()

# get() method
# returns the value for the given key
# if the key does not exist, returns the default value (None unless a default is given)

student_scores = {"Alice": 90, "Bob": 85, "Charlie": 78}

alice_score = student_scores.get("Alice")
print(alice_score)
# 90

david_score = student_scores.get("David", 0)
print(david_score)
# 0


# keys() method
# returns a view of all keys (wrap it in list() to get a list)
fruits = {"apple": "사과", "banana": "바나나", "cherry": "체리"}

fruit_keys = fruits.keys()
print(list(fruit_keys))
# ['apple', 'banana', 'cherry']


# values() method
# returns a view of all values (wrap it in list() to get a list)
fruits = {"apple": "사과", "banana": "바나나", "cherry": "체리"}

fruit_values = fruits.values()
print(list(fruit_values))
# ['사과', '바나나', '체리']


# items() method
# returns a view of all (key, value) pairs (wrap it in list() to get a list)
student_scores = {"Alice": 90, "Bob": 85, "Charlie": 78}

student_items = student_scores.items()
print(list(student_items))
# [('Alice', 90), ('Bob', 85), ('Charlie', 78)]


# update() method
# adds or overwrites keys and values from another dictionary
fruit_prices = {"apple": 1.0, "banana": 0.5, "cherry": 2.0}

new_prices = {"banana": 0.6, "grape": 1.5}
fruit_prices.update(new_prices)
print(fruit_prices)
# {'apple': 1.0, 'banana': 0.6, 'cherry': 2.0, 'grape': 1.5}
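
In practice `items()` and `values()` are most often used in a loop or passed to a function, without converting to a list. A minimal sketch; the passing threshold of 80 is an assumption made up for this example.

```python
student_scores = {"Alice": 90, "Bob": 85, "Charlie": 78}

# items() yields (key, value) pairs that can be unpacked in the loop header
passed = []
for name, score in student_scores.items():
    if score >= 80:  # hypothetical passing threshold
        passed.append(name)

# values() can be passed straight to built-ins such as sum() and max()
total = sum(student_scores.values())
highest = max(student_scores.values())

print(passed)   # ['Alice', 'Bob']
print(total)    # 253
print(highest)  # 90
```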

Tuple type (tuple)

Once created, its elements cannot be changed (added, deleted, or modified)

t = (1, 2, 3)
t[0] = 1 # TypeError
t.append(4) # AttributeError

To add data, create a new tuple and assign it to a new name

newt = t + (4,)
print(newt)
# (1, 2, 3, 4)
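
Because tuples are fixed, they are often read by unpacking their elements into variables instead of being modified. A small sketch with a made-up `point` tuple:

```python
point = (3, 5)

x, y = point  # unpacking: one variable per element
x, y = y, x   # swapping works by building a temporary tuple

try:
    point[0] = 10  # tuples do not support item assignment
    changed = True
except TypeError:
    changed = False

print(x, y)     # 5 3
print(changed)  # False
print(point)    # (3, 5)
```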

Set type (set)

Unordered
Duplicate data is not allowed

x = set('abbbbcccdde')
x
# {'a', 'b', 'c', 'd', 'e'}
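
Since a set has no order, the display order of its elements may vary. Sets are mostly useful for removing duplicates and for union/intersection style operations, sketched below with made-up values.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b         # elements in either set
intersection = a & b  # elements in both sets
difference = a - b    # elements in a but not in b

# Removing duplicates from a list; sorted() gives a predictable order back
unique_sorted = sorted(set([3, 1, 3, 2, 1]))

print(union == {1, 2, 3, 4, 5})  # True
print(intersection == {3, 4})    # True
print(difference == {1, 2})      # True
print(unique_sorted)             # [1, 2, 3]
```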

License: Attribution-NonCommercial-NoDerivatives
